Re: [sqlite] Performance degradation of in-memory database with high row count and overall design discussion of my project

2017-01-06 Thread James K. Lowden
On Thu, 5 Jan 2017 18:56:13 +0100
Simone Mosciatti wrote:

> I could store SQL statements in a buffer and execute all of them in
> one single transaction.

That's a good idea, all things being equal.  But are they equal?  Your
problem description is entirely technical.  It doesn't address how
transactions map onto the problem domain.  

Questions you should ask yourself:

1.  What is the unit of work?  What combinations of operations
constitute a logically consistent change to the database?  

2.  How many units of work are you prepared to lose in the event of
system failure?  That determines how often you commit; a sketch of
batched commits follows below.
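
To make point 2 concrete, here is a minimal sketch of
one-transaction-per-unit-of-work batching.  It is an illustration, not
code from any of these posts: it uses the same Tcl interface as the
test later in this thread, and the table and batch shape are
hypothetical.

## START TCL CODE (illustrative sketch)
package require sqlite3

sqlite3 db :memory:
db eval { CREATE TABLE test (a INT, b INT, c INT) }

# One transaction per unit of work.  "db transaction" commits at the
# end of the script and rolls back automatically if any statement
# inside it fails.
proc write_batch {rows} {
  db transaction {
    foreach {a b c} $rows {
      db eval { INSERT INTO test VALUES($a, $b, $c) }
    }
  }
}

write_batch {1 2 3  4 5 6}   ;# two rows batched as one unit of work
## END TCL CODE

If the system fails before COMMIT, the whole batch is lost but the
database stays consistent; the batch size is therefore exactly the
number of units of work you are prepared to lose.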

> 2) I need some way to indicate whether an error occurs during a write
> operation, which is a problem I am not able to solve within a
> reasonable bound of complexity.

I'm not sure I understand.  SQLite will commit the transaction
atomically.  If the transaction fails for logical or physical reasons,
the error is indicated at commit time.  How is indicating that error to
the user in any way difficult?  
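
Concretely, reusing the hypothetical write_batch sketch above: any
failure inside the batch, including one raised at COMMIT, propagates
as a single error for the whole unit of work, which makes reporting it
a one-liner.

## START TCL CODE (illustrative sketch)
# "db transaction" rolls back on error, so the caller sees an
# all-or-nothing result: either the batch committed, or this branch
# runs and nothing was written.
set rows {7 8 9  10 11 12}   ;# another hypothetical batch
if {[catch { write_batch $rows } err]} {
  puts stderr "batch rejected, nothing was written: $err"
}
## END TCL CODE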

--jkl



Re: [sqlite] Performance degradation of in-memory database with high row count and overall design discussion of my project

2017-01-05 Thread Dan Kennedy

On 01/06/2017 12:56 AM, Simone Mosciatti wrote:

> Hi all,
>
> I am writing a small Redis module that embeds SQLite [1].
>
> I am facing some performance issues and I wonder if the behaviour I am
> seeing is expected or if I am doing something completely wrong.
>
> [...]
>
> However I am finding some performance issues with the in-memory
> database: insert performance decreases, quite quickly, as more and
> more tuples are added. I ran some tests and I would like to understand
> whether this behaviour is expected.
>
> I collected the results of my benchmark in a GitHub issue [2], where
> you can see that performance decreases more than linearly, from
> roughly 24,000 inserts per second until it stabilizes at roughly 4,000
> inserts per second.
>
> The actual number of inserts per second is fairly meaningless: it
> depends on the hardware, the resources of the machine, and so on. What
> leaves me wondering is that I was expecting a logarithmic decrease in
> performance, not an (at least) linear one.
>
> Can somebody confirm that this is normal?


Can you describe the performance test you are running?

If I create a table in an in-memory database with:

  CREATE TABLE test (a INT, b INT, c INT);

Then run this:

  INSERT INTO test VALUES(random(), random(), random())

10,000,000 times, I get a fairly stable 330,000 inserts per second or 
so. Tcl code to do this is below. What is your test doing differently?


Dan.


## START TCL CODE
package require sqlite3

sqlite3 db :memory:
db eval { CREATE TABLE test (a INT, b INT, c INT); }

# Insert n rows of random data, one INSERT statement per row.
proc insert_n_rows {n} {
  for {set i 0} {$i < $n} {incr i} {
    db eval { INSERT INTO test VALUES(random(), random(), random()) }
  }
}

# 100 batches of 100,000 rows each: 10,000,000 rows in total.
set nStep 100000

for {set i 0} {$i < 100} {incr i} {
  # [time] reports elapsed microseconds for the batch.
  set us [lindex [time { insert_n_rows $nStep }] 0]
  puts "[expr {$i*$nStep}] [format %.2f [expr {(1000000.0*$nStep)/$us}]]/sec"
}
## END TCL CODE



[sqlite] Performance degradation of in-memory database with high row count and overall design discussion of my project

2017-01-05 Thread Simone Mosciatti

Hi all,

I am writing a small Redis module that embeds SQLite [1].

I am facing some performance issues and I wonder if the behaviour I am 
seeing is expected or if I am doing something completely wrong.


Let me provide some context on what I am doing, hoping that you may see 
some flaw in my reasoning and point me in a better direction; I am sorry 
for the long email.


The main problem I am trying to solve is that SQLite is slow when 
writing to disk; this is completely expected, since I am using a 
spinning disk and we need to hit the disk for every commit (at least 
with the default PRAGMA settings).
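
For reference, the defaults in question are mainly the journal mode and 
the synchronous level. A minimal sketch of those knobs with illustrative 
values follows; whether the durability trade-off is acceptable depends 
on the application.

## START TCL CODE (illustrative sketch)
package require sqlite3
sqlite3 db /tmp/example.db   ;# hypothetical on-disk database

# WAL appends committed pages to a log instead of going through a
# rollback journal, which typically reduces per-commit disk work.
db eval { PRAGMA journal_mode = WAL }

# NORMAL syncs less often than the default FULL: a power loss cannot
# corrupt the database, but the most recent commits may be lost.
db eval { PRAGMA synchronous = NORMAL }
## END TCL CODE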


However I am trying to figure out how to make it faster.

The very first thing that comes to mind is to avoid doing a lot of 
transactions: I could store SQL statements in a buffer and execute all 
of them in one single transaction.


This is perfect from the performance point of view, but two problems arise:

1) I need to differentiate between SELECT operations and 
DELETE/INSERT/UPDATE operations; I could manage this part.


2) I need some way to indicate whether an error occurs during a write 
operation, which is a problem I am not able to solve within a reasonable 
bound of complexity.



The second option I am considering is to let Redis take care of the 
persistence: use SQLite in-memory and periodically write a snapshot to 
the RDB file (the RDB file is a snapshot of the whole internal state of 
Redis).


This approach would work fine: I would be fast enough to COMMIT every 
operation, so I could provide immediate feedback in the case of an 
error, while still providing enough throughput, and the data would be 
written to disk "reasonably" often.
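
Redis's RDB serialization is its own mechanism, but as a stand-in 
sketch of the periodic-snapshot idea, SQLite's online backup API 
(sqlite3_backup_*() in C, "db backup" in Tcl) can copy a live in-memory 
database to a file; the path and period below are illustrative.

## START TCL CODE (illustrative sketch)
package require sqlite3

sqlite3 db :memory:
db eval { CREATE TABLE test (a INT, b INT, c INT) }

# Copy the whole in-memory database to a disk file every 60 seconds;
# "db backup" wraps SQLite's online backup API.
proc snapshot {} {
  db backup /tmp/snapshot.db
  after 60000 snapshot
}
snapshot
vwait forever   ;# keep the Tcl event loop alive in a plain tclsh
## END TCL CODE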


However I am finding some performance issues with the in-memory 
database: insert performance decreases, quite quickly, as more and more 
tuples are added. I ran some tests and I would like to understand 
whether this behaviour is expected.


I collected the results of my benchmark in a GitHub issue [2], where 
you can see that performance decreases more than linearly, from roughly 
24,000 inserts per second until it stabilizes at roughly 4,000 inserts 
per second.


The actual number of inserts per second is fairly meaningless: it 
depends on the hardware, the resources of the machine, and so on. What 
leaves me wondering is that I was expecting a logarithmic decrease in 
performance, not an (at least) linear one.


Can somebody confirm that this is normal?

Given the background I provided, could you suggest a better way to 
achieve reasonable performance and reasonable data safety?



[1]: https://github.com/RedBeardLab/rediSQL

[2]: https://github.com/RedBeardLab/rediSQL/issues/13#issuecomment-270449852
