[sqlite] Please test the latest SQLite snapshot

2012-11-09 Thread Richard Hipp
Please test the latest snapshot of SQLite in your applications and report
any performance regressions to this mailing list, or directly to me at
d...@sqlite.org.   You can get a tarball or ZIP archive of the latest raw
sources from Fossil at http://www.sqlite.org/src/info/trunk or you can pull
down a snapshot of a recent amalgamation from:

http://www.sqlite.org/download.html
http://www.sqlite.org/sqlite-amalgamation-201211092140.zip

We have made many improvements to the SQLite query planner/optimizer since
the last release.  These enhancements should help to make most applications
run faster.  But coding a query planner is more art than science, and so
there is the possibility that some of the recent enhancements have made
some corner-case queries run slower.  Unfortunately, the only way for us to
identify those cases is to test SQLite in as many applications as possible.

So, please do test the latest SQLite amalgamation in your applications, and
please let us know if you encounter any performance regressions.  We cannot
fix performance problems if we do not know about them.  Please do NOT wait
until an official release to run this test - it will be too late then.

FWIW, this email is being typed into Gmail running on a Firefox Nightly
that was recompiled with the SQLite amalgamation snapshot shown above.  And
the Fossil binary that you see at http://www.sqlite.org/src/timeline is
also running the latest SQLite.  (See
http://en.wikipedia.org/wiki/Eating_your_own_dog_food)  But we still need
your help to test the latest SQLite enhancements in as many applications as
possible.

Thanks

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] problem with migration from 1.0.66 to 1.0.82 (DateTimeKind)

2012-11-09 Thread Joe Mistachkin

Olexandr Kravets wrote:
>  
> Some columns in
> database are used to store date and time in UTC and some in local.
> 

You are storing date/time values in both local time and UTC in the same
database?  This is problematic, see below.

> 
> I came to the following conclusion. If I migrate to 1.0.82, then I
> definitely need to specify DateTimeKind and make it equal to Utc.
> 

By default, the DateTimeKind value used is Unspecified.  When storing
date/times from multiple time zones (i.e. including UTC and local time), it is
highly recommended that all DateTime values be stored as UTC and that
DateTimeKind.Utc be used when opening the database connection.

> 
> Is it possible to find some information on why the date/time format was
> changed?
> 

Prior to the changes that added DateTimeKind, there was no reliable way to get
consistently correct DateTime values out of a column, even if the values were
correctly stored using UTC.

This is partially due to the fact that the .NET Framework itself does not
attempt to automatically recognize or convert DateTime values to any
particular time zone (including local time) when using the DateTime.Parse or
DateTime.TryParse methods.  Also, it will not, by itself, recognize a trailing
"Z" as indicating UTC time in these situations.  Instead, the DateTimeKind
value must be used with the SpecifyKind method to construct a DateTime value
with the appropriate time zone, which can only be UTC or local time.

Therefore, if the application requires date/time values from multiple time
zones, the only reasonable choice is to store everything in UTC, use
DateTimeKind.Utc for the database connection, and convert to local time for
the user interface.
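The same store-UTC, display-local discipline is language-independent.  As an
illustration only (the table name and format string below are made up, and
this is plain Python rather than the System.Data.SQLite internals), a minimal
sketch of the pattern:

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT)")

# Write path: normalize to UTC before storing; a trailing "Z" marks UTC.
stamp = datetime(2012, 11, 9, 21, 40, 0, tzinfo=timezone.utc)
conn.execute("INSERT INTO events (created_at) VALUES (?)",
             (stamp.strftime("%Y-%m-%d %H:%M:%SZ"),))

# Read path: parse as UTC; convert to local time only at the UI boundary.
(text,) = conn.execute("SELECT created_at FROM events").fetchone()
parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%SZ").replace(tzinfo=timezone.utc)
local_for_display = parsed.astimezone()  # local wall-clock, for display only
print(text)  # 2012-11-09 21:40:00Z
```

The stored text is always unambiguous; only the final `astimezone()` call
depends on the machine's time zone.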

--
Joe Mistachkin



Re: [sqlite] Help to create view of my data

2012-11-09 Thread Simon Slavin

On 9 Nov 2012, at 4:16pm, Clemens Ladisch  wrote:

> Just use SQL to generate the SQL query:
> 
> SELECT 'SELECT units.id AS "unit id"' AS SQL
> UNION ALL
> SELECT ', cv' || id || '.value AS ' || quote(name) FROM configtypes
> UNION ALL
> SELECT 'FROM units JOIN unit_types ON units.unit_type_id = unit_types.id'
> UNION ALL
> SELECT 'LEFT JOIN (unittype_config JOIN configvalues' ||
>   ' ON unittype_config.configvalue_id = configvalues.id' ||
>   ' AND unittype_config.configtype_id = ' || id || ') AS cv' || id ||
>   ' ON unit_types.id = cv' || id || '.unit_id' FROM configtypes

I'm sending you my psychiatrist's bill.

In answer to the original query, my instinct is to say that you should be using 
your own software to collect the results of several SELECTs, not trying to do 
everything inside a database engine.

Simon.


[sqlite] problem with migration from 1.0.66 to 1.0.82 (DateTimeKind)

2012-11-09 Thread Olexandr Kravets

Hello all,


Currently, I am using SQLite.NET version 1.0.66, and all datetime columns by
default have the following format:

2012-11-05 16:05:10.3702633

Some columns in the database are used to store date and time in UTC and some
in local time.  Therefore, there is no way to tell whether a column holds UTC
or local time.

Currently, the following test case passes:

   DateTime CheckedAt = new DateTime(2000, 1, 1, 3, 44, 55).ToUniversalTime();

   // Save to database
   // 2000-01-01 08:44:55

   // Retrieve from db: {1/1/2000 8:44:55 AM}, basically like the line below
   DateTime DateTimeFromDB = new DateTime(2000, 1, 1, 8, 44, 55, DateTimeKind.Unspecified);

   Assert.AreEqual(CheckedAt.ToString(), DateTimeFromDB.ToString(),
       "Retrieved date and time does not match to stored.");

After I ran the same test with 1.0.82, I got a different result in the
database ... it failed.

   DateTime CheckedAt = new DateTime(2000, 1, 1, 3, 44, 55).ToUniversalTime();

   // Save to database
   // 2000-01-01 08:44:55Z

   // Retrieve from db: {1/1/2000 3:44:55 AM}, basically like the line below
   DateTime DateTimeFromDB = new DateTime(2000, 1, 1, 3, 44, 55, DateTimeKind.Unspecified);

   Assert.AreEqual(CheckedAt.ToString(), DateTimeFromDB.ToString(),
       "Retrieved date and time does not match to stored.");

It looks like SQLite.NET 1.0.82 returns date and time as local time by
default.  Also, the time is stored in the DB with a "Z" at the end.

Of course, I found that I could specify DateTimeKind = DateTimeKind.Utc in
the connection string.  After that, the test passes.

I came to the following conclusion: if I migrate to 1.0.82, then I definitely
need to specify DateTimeKind and make it equal to Utc.  After that, I will
have no issues with the times I stored and used as UTC before.  Local time
will be returned to me as UTC, but because I don't do any conversion on it, I
will see it in the UI as-is, which matches the current behavior.

Am I correct?

Is it possible to find some information on why the date/time format was
changed?

Thank you,

Olexandr


Re: [sqlite] Help to create view of my data

2012-11-09 Thread Clemens Ladisch
Clemens Ladisch wrote:
> it is not possible to generate columns automatically from data values.

"All problems in computer science can be solved by another level of
indirection."  (David Wheeler)

Just use SQL to generate the SQL query:

SELECT 'SELECT units.id AS "unit id"' AS SQL
UNION ALL
SELECT ', cv' || id || '.value AS ' || quote(name) FROM configtypes
UNION ALL
SELECT 'FROM units JOIN unit_types ON units.unit_type_id = unit_types.id'
UNION ALL
SELECT 'LEFT JOIN (unittype_config JOIN configvalues' ||
   ' ON unittype_config.configvalue_id = configvalues.id' ||
   ' AND unittype_config.configtype_id = ' || id || ') AS cv' || id ||
   ' ON unit_types.id = cv' || id || '.unit_id' FROM configtypes
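This generate-then-execute pattern can also be driven from a host language:
run the generator query, concatenate its rows into one statement, and execute
that statement.  A minimal Python sketch of the same idea, using a made-up toy
schema and conditional aggregation for the generated pivot columns (rather
than the join chain above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-in for the configtypes/configvalues tables in this thread.
conn.executescript("""
CREATE TABLE configtypes (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE settings (unit_id INT, configtype_id INT, value INT);
INSERT INTO configtypes VALUES (1, 'Color'), (2, 'Number');
INSERT INTO settings VALUES (1, 1, 4), (1, 2, 7), (2, 1, 3);
""")

# Step 1: generate one pivot column per configtypes row.  quote() makes the
# type name safe to embed in the generated SQL.
cols = ", ".join(
    "MAX(CASE WHEN configtype_id = %d THEN value END) AS %s" % (tid, name)
    for tid, name in conn.execute("SELECT id, quote(name) FROM configtypes"))
sql = "SELECT unit_id, %s FROM settings GROUP BY unit_id" % cols

# Step 2: execute the generated statement.
rows = conn.execute(sql).fetchall()
print(rows)  # [(1, 4, 7), (2, 3, None)]
```

Adding a row to configtypes automatically adds a column to the generated
query, which is exactly what a static view cannot do.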


Regards,
Clemens


Re: [sqlite] Help to create view of my data

2012-11-09 Thread Clemens Ladisch
Daniel Polski wrote:
> I'm redesigning parts of my database to normalize it better [...]
> The SELECT query should return all units, and for each configuration
> option it should return either the stored value (if available) or null
> if no such option exist for the unit.

SQLite does not have pivot functions; it is not possible to generate
columns automatically from data values.

In this query, the three configuration options are hardcoded:

SELECT units.id AS "unit id",
       cv1.value AS "General option",
       cv2.value AS "Color",
       cv3.value AS "Number"
FROM units
JOIN unit_types ON units.unit_type_id = unit_types.id
LEFT JOIN (unittype_config JOIN configvalues
           ON unittype_config.configvalue_id = configvalues.id
           AND unittype_config.configtype_id = 1) AS cv1
       ON unit_types.id = cv1.unit_id
LEFT JOIN (unittype_config JOIN configvalues
           ON unittype_config.configvalue_id = configvalues.id
           AND unittype_config.configtype_id = 2) AS cv2
       ON unit_types.id = cv2.unit_id
LEFT JOIN (unittype_config JOIN configvalues
           ON unittype_config.configvalue_id = configvalues.id
           AND unittype_config.configtype_id = 3) AS cv3
       ON unit_types.id = cv3.unit_id

Are you sure that the normalization hasn't gone a *little* bit too far?

> FOREIGN KEY( unit_id ) REFERENCES unit_types ( id )

Why is this column not called unit_type_id?


Regards,
Clemens


[sqlite] Help to create view of my data

2012-11-09 Thread Daniel Polski

Hello,
I'm redesigning parts of my database to normalize it better and to make 
a more general solution available for other parts of the system.


My goal now is to create a view which will let me do a select in the 
style of:


SELECT (unit id) (general option value) (color value) (number value) 
FROM unit_view;


The SELECT query should return all units, and for each configuration 
option it should return either the stored value (if available) or null 
if no such option exist for the unit.
With the test data in the database I provide below I would like the 
query to return:


unit id   general option   color   number
1         1                null    2
2         1                1       null

Does anyone know how to create such a view?

Thank you in advance,
/Daniel

---

PRAGMA FOREIGN_KEYS=ON;

CREATE TABLE unit_types (
    id      integer primary key,
    name    varchar(20) NOT NULL
);

CREATE TABLE units (
    id              integer primary key,
    unit_type_id    int not null,
    name            varchar(20) NOT NULL,
    CONSTRAINT "FK_units" FOREIGN KEY( unit_type_id ) REFERENCES unit_types ( id )
);

CREATE TABLE configtypes(
    id      integer primary key,
    name    varchar(20) NOT NULL
);

CREATE TABLE configvalues(
    id      integer primary key,
    name    varchar(20) NOT NULL,
    value   int NOT NULL
);

CREATE TABLE unittype_config(
    unit_id         int not null,
    configtype_id   int not null,
    configvalue_id  int not null,
    CONSTRAINT "PK_uc" PRIMARY KEY (unit_id, configtype_id, configvalue_id),
    CONSTRAINT "FK_uc_001" FOREIGN KEY( unit_id ) REFERENCES unit_types ( id ),
    CONSTRAINT "FK_uc_002" FOREIGN KEY( configtype_id ) REFERENCES configtypes ( id ),
    CONSTRAINT "FK_uc_003" FOREIGN KEY( configvalue_id ) REFERENCES configvalues ( id )
);

--create test data
INSERT INTO unit_types VALUES(1, "Number unit");
INSERT INTO unit_types VALUES(2, "Color unit");

INSERT INTO configtypes VALUES(1, "General option");
INSERT INTO configtypes VALUES(2, "Color");
INSERT INTO configtypes VALUES(3, "Number");

INSERT INTO configvalues VALUES(1, "Yes", 1);
INSERT INTO configvalues VALUES(2, "No", 0);
INSERT INTO configvalues VALUES(3, "Blue", 0);
INSERT INTO configvalues VALUES(4, "Red", 1);
INSERT INTO configvalues VALUES(5, "Green", 2);
INSERT INTO configvalues VALUES(6, "1", 1);
INSERT INTO configvalues VALUES(7, "2", 2);

INSERT INTO unittype_config VALUES(1, 1, 1);
INSERT INTO unittype_config VALUES(1, 3, 7);

INSERT INTO unittype_config VALUES(2, 1, 1);
INSERT INTO unittype_config VALUES(2, 2, 4);

INSERT INTO units values(1, 1, "Unit 1");
INSERT INTO units values(2, 2, "Unit 2");
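For what it's worth, the schema and test data above can be loaded into an
in-memory database and queried with three hardcoded configuration columns (one
per configtype).  The Python sketch below uses a condensed copy of the schema
and writes each joined pair as an equivalent subselect; it reproduces the
desired result rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Condensed version of the schema and test data from this message.
conn.executescript("""
CREATE TABLE unit_types (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE units (id INTEGER PRIMARY KEY, unit_type_id INT NOT NULL,
                    name TEXT NOT NULL);
CREATE TABLE configtypes (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE configvalues (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           value INT NOT NULL);
CREATE TABLE unittype_config (unit_id INT NOT NULL, configtype_id INT NOT NULL,
                              configvalue_id INT NOT NULL,
                              PRIMARY KEY (unit_id, configtype_id, configvalue_id));
INSERT INTO unit_types VALUES (1, 'Number unit'), (2, 'Color unit');
INSERT INTO configtypes VALUES (1, 'General option'), (2, 'Color'), (3, 'Number');
INSERT INTO configvalues VALUES (1, 'Yes', 1), (2, 'No', 0), (3, 'Blue', 0),
    (4, 'Red', 1), (5, 'Green', 2), (6, '1', 1), (7, '2', 2);
INSERT INTO unittype_config VALUES (1, 1, 1), (1, 3, 7), (2, 1, 1), (2, 2, 4);
INSERT INTO units VALUES (1, 1, 'Unit 1'), (2, 2, 'Unit 2');
""")

# One subselect per hardcoded configuration type (1..3).
cv = ("(SELECT unittype_config.unit_id AS unit_id, configvalues.value AS value"
      " FROM unittype_config JOIN configvalues"
      " ON unittype_config.configvalue_id = configvalues.id"
      " WHERE unittype_config.configtype_id = %d)")
rows = conn.execute("""
    SELECT units.id, cv1.value, cv2.value, cv3.value
    FROM units
    JOIN unit_types ON units.unit_type_id = unit_types.id
    LEFT JOIN {cv1} AS cv1 ON unit_types.id = cv1.unit_id
    LEFT JOIN {cv2} AS cv2 ON unit_types.id = cv2.unit_id
    LEFT JOIN {cv3} AS cv3 ON unit_types.id = cv3.unit_id
    ORDER BY units.id
""".format(cv1=cv % 1, cv2=cv % 2, cv3=cv % 3)).fetchall()
print(rows)  # [(1, 1, None, 2), (2, 1, 1, None)]
```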



Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread O'Toole, Eamonn
Thanks again Michael.

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Black, Michael (IS)
> Sent: 09 November 2012 15:07
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Anomalously slow performance on updates to
> earlyentries in a DB
> 
> What I would do is find the max length of your data fields.
> Then dump the database, change the create table to use default values
> at those string lengths.
> Import it.
> 
> See what that does for you.  Or just reload your data the way you've
> been doing with the new default string lengths.
> 
> Also a compound index on name/created_at could help you a lot.
> 
> And since you're in a trigger I'm not sure if the insert or replace
> would help since that's essentially what you're doing anyways and I
> believe that's all wrapped inside a transaction inside triggers.
> Somebody please correct me if this is not true.
> 
> 
> 
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Advanced GEOINT Solutions Operating Unit
> Northrop Grumman Information Systems
> 
> 
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org]
> on behalf of O'Toole, Eamonn [eamonn.oto...@hp.com]
> Sent: Friday, November 09, 2012 8:53 AM
> To: General Discussion of SQLite Database
> Subject: EXT :Re: [sqlite] Anomalously slow performance on updates to
> earlyentries in a DB
> 
> > -Original Message-
> > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> > boun...@sqlite.org] On Behalf Of Black, Michael (IS)
> > Sent: 09 November 2012 14:26
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] Anomalously slow performance on updates to
> > earlyentries in a DB
> >
> > >[O'Toole, Eamonn] This definitely sounds like it could be an issue.
> > There is just one container_stat entry (confirmed by sqlite3_analyzer
> > output, which I'll post later).  So you're saying that the single
> > container_stat table entry is potentially being relocated very
> > frequently the closer the update is to the beginning of the db?
> >
> > Yes...you didn't say what your data flow is...but since it's account-based
> > I assume you have a bunch of accounts that get preloaded.
> [O'Toole, Eamonn] In Swift the db is used to store information on the
> containers.  The hierarchy in Swift is account->container->object.  The
> test targets a specific container of a specific account, which contains
> information on 10 million objects, and changes the "created_at" field
> for the first 2 million entries.  First we create the container, add
> the 10 million entries to it, then we start the update cycle.  We see
> this slow performance on every update run, although if you run updates
> in succession without any break in between runs you do see an
> improvement in performance due to caching.
> 
> 
> > So the first n-thousand records are Size1.
> > You then start updating each of those...none of them are big
> > enough...the new records get inserted at the first available empty
> slot
> > (is that actually how this works or is there another row allocation
> > strategy?).
> [O'Toole, Eamonn] The SQL transactions are done through python.  This
> is the relevant section of code that deals with the object table, the
> container_stat table is updated by the triggers:
> 
> for rec in item_list:
> query = '''
> DELETE FROM object
> WHERE name = ? AND (created_at < ?)
> '''
> if self.get_db_version(conn) >= 1:
> query += ' AND deleted IN (0, 1)'
> conn.execute(query, (rec['name'], rec['created_at']))
> query = 'SELECT 1 FROM object WHERE name = ?'
> if self.get_db_version(conn) >= 1:
> query += ' AND deleted IN (0, 1)'
> if not conn.execute(query, (rec['name'],)).fetchall():
> conn.execute('''
> INSERT INTO object (name, created_at, size,
> content_type, etag, deleted)
> VALUES (?, ?, ?, ?, ?, ?)
> ''', ([rec['name'], rec['created_at'], rec['size'],
>   rec['content_type'], rec['etag'],
> rec['deleted']]))
> 
> 
> > Now you go to update those records again...they can't fit in the 1st
> > block...and some percentage of the data won't fit into the 2nd block
> > (depends on the variability in size).  So, let's say half the records
> > get relocated...eventually you reach homeostasis.
> >
> > Also...disk fragmentation could be affecting you too but I wouldn't
> > expect an order of magnitude difference on that.
> >
> [O'Toole, Eamonn] Disk fragmentation is definitely a factor, but as you
> say defragging doesn't get you an order of magnitude improvement.
> > Are you on Windows or Unix?
> [O'Tool

Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread O'Toole, Eamonn
Thanks again Clemens.

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Clemens Ladisch
> Sent: 09 November 2012 15:08
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Anomalously slow performance on updates to
> earlyentries in a DB
> 
> O'Toole, Eamonn wrote:
> > DELETE FROM object
> > WHERE name = ? AND (created_at < ?)
> 
> This needs an index with "name" as the *first* column.
> 
> > query += ' AND deleted IN (0, 1)'
> 
> If the name column is (nearly) unique, putting indexes on the created_at
> or deleted columns is not useful (for this query).

[O'Toole, Eamonn] The name column is guaranteed to be unique.
> 
> See  for an explanation of how
> indexes work.
> 
> 
> Regards,
> Clemens


Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread Clemens Ladisch
O'Toole, Eamonn wrote:
> DELETE FROM object
> WHERE name = ? AND (created_at < ?)

This needs an index with "name" as the *first* column.

> query += ' AND deleted IN (0, 1)'

If the name column is (nearly) unique, putting indexes on the created_at
or deleted columns is not useful (for this query).

See  for an explanation of how
indexes work.
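The effect of such an index can be checked directly with EXPLAIN QUERY PLAN.
A Python sketch (the table and index names here are made up for illustration,
not taken from the Swift schema) shows the DELETE switching from a full table
scan to an index search once a (name, created_at) index exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the "object" table discussed in this thread.
conn.execute("CREATE TABLE object (name TEXT, created_at TEXT, deleted INT)")

def plan():
    # EXPLAIN QUERY PLAN rows carry the plan description in the last column.
    return [r[-1] for r in conn.execute(
        "EXPLAIN QUERY PLAN DELETE FROM object "
        "WHERE name = ? AND created_at < ?", ("a", "b"))]

before = plan()  # full table scan: no usable index yet
conn.execute("CREATE INDEX ix_name_created ON object (name, created_at)")
after = plan()   # now a search on ix_name_created
print(before)
print(after)
```

Because "name" is the leftmost column, the equality constraint narrows the
search first and the created_at range constraint is applied within it.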


Regards,
Clemens


Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread Black, Michael (IS)
What I would do is find the max length of your data fields.
Then dump the database, change the create table to use default values at those 
string lengths.
Import it.

See what that does for you.  Or just reload your data the way you've been doing 
with the new default string lengths.

Also a compound index on name/created_at could help you a lot.

And since you're in a trigger I'm not sure if the insert or replace would help 
since that's essentially what you're doing anyways and I believe that's all 
wrapped inside a transaction inside triggers.  Somebody please correct me if 
this is not true.



Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of O'Toole, Eamonn [eamonn.oto...@hp.com]
Sent: Friday, November 09, 2012 8:53 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] Anomalously slow performance on updates to 
earlyentries in a DB

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Black, Michael (IS)
> Sent: 09 November 2012 14:26
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Anomalously slow performance on updates to
> earlyentries in a DB
>
> >[O'Toole, Eamonn] This definitely sounds like it could be an issue.
> There is just one container_stat entry (confirmed by sqlite3_analyzer
> output, which I'll post later).  So you're saying that the single
> container_stat table entry is potentially being relocated very
> frequently the closer the update is to the beginning of the db?
>
> Yes...you didn't say what your data flow is...but since it's account-based
> I assume you have a bunch of accounts that get preloaded.
[O'Toole, Eamonn] In Swift the db is used to store information on the 
containers.  The hierarchy in Swift is account->container->object.  The test 
targets a specific container of a specific account, which contains information 
on 10 million objects, and changes the "created_at" field for the first 2 
million entries.  First we create the container, add the 10 million entries to 
it, then we start the update cycle.  We see this slow performance on every 
update run, although if you run updates in succession without any break in 
between runs you do see an improvement in performance due to caching.


> So the first n-thousand records are Size1.
> You then start updating each of those...none of them are big
> enough...the new records get inserted at the first available empty slot
> (is that actually how this works or is there another row allocation
> strategy?).
[O'Toole, Eamonn] The SQL transactions are done through python.  This is the 
relevant section of code that deals with the object table, the container_stat 
table is updated by the triggers:

for rec in item_list:
query = '''
DELETE FROM object
WHERE name = ? AND (created_at < ?)
'''
if self.get_db_version(conn) >= 1:
query += ' AND deleted IN (0, 1)'
conn.execute(query, (rec['name'], rec['created_at']))
query = 'SELECT 1 FROM object WHERE name = ?'
if self.get_db_version(conn) >= 1:
query += ' AND deleted IN (0, 1)'
if not conn.execute(query, (rec['name'],)).fetchall():
conn.execute('''
INSERT INTO object (name, created_at, size,
content_type, etag, deleted)
VALUES (?, ?, ?, ?, ?, ?)
''', ([rec['name'], rec['created_at'], rec['size'],
  rec['content_type'], rec['etag'], rec['deleted']]))


> Now you go to update those records again...they can't fit in the 1st
> block...and some percentage of the data won't fit into the 2nd block
> (depends on the variability in size).  So, let's say half the records
> get relocated...eventually you reach homeostasis.
>
> Also...disk fragmentation could be affecting you too but I wouldn't
> expect an order of magnitude difference on that.
>
[O'Toole, Eamonn] Disk fragmentation is definitely a factor, but as you say 
defragging doesn't get you an order of magnitude improvement.
> Are you on Windows or Unix?
[O'Toole, Eamonn] Linux (Ubuntu)
>
> Also...are your records indexed for the updates?
[O'Toole, Eamonn] The only object table index that I can see is the 
ix_deleted_name index, and that isn't used by the table update logic.
>
>
>
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Advanced GEOINT Solutions Operating Unit
> Northrop Grumman Information Systems
>

Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread Clemens Ladisch
O'Toole, Eamonn wrote:
>> From: Clemens Ladisch
>> Is it the DELETE or the INSERT that is slow, or both?
>
> Good question - I don't know
>
>> Are you addressing the records by ROWID or by name?
>
>  By name

So when the DELETE searches for the record to be deleted, it has to scan
the *entire* table ...

... unless you add an index on that field.

(And does this mean that the name, and not the ROWID, should be the
primary key?)

>> How big are the records, compared to the page size?
>> (And what is the output of sqlite3_analyzer?)
>
> I'll post the sqlite_analyzer output in a separate mail.

Nothing interesting there.

You might consider increasing the page size, which reduces page
management overhead slightly.  (All of today's file systems and OS
caches use a page size of 4 KB anyway, so using smaller DB pages does
not make sense.)
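Note that the page size only takes effect while the database file is still
empty (or after a VACUUM), so it has to be set before the first table is
created.  A minimal Python sketch of the idea:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PRAGMA page_size must run before the first write fixes the page size.
conn.execute("PRAGMA page_size = 8192")
conn.execute("CREATE TABLE t (x INTEGER)")  # first write locks it in
(size,) = conn.execute("PRAGMA page_size").fetchone()
print(size)  # 8192
```

On an existing database like the 1 KB-page container DB in this thread, the
equivalent is to set the pragma and then run VACUUM to rebuild the file.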


Regards,
Clemens


Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread O'Toole, Eamonn


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Black, Michael (IS)
> Sent: 09 November 2012 14:26
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Anomalously slow performance on updates to
> earlyentries in a DB
> 
> >[O'Toole, Eamonn] This definitely sounds like it could be an issue.
> There is just one container_stat entry (confirmed by sqlite3_analyzer
> output, which I'll post later).  So you're saying that the single
> container_stat table entry is potentially being relocated very
> frequently the closer the update is to the beginning of the db?
> 
> Yes...you didn't say what your data flow is...but since it's account-based
> I assume you have a bunch of accounts that get preloaded.
[O'Toole, Eamonn] In Swift the db is used to store information on the 
containers.  The hierarchy in Swift is account->container->object.  The test 
targets a specific container of a specific account, which contains information 
on 10 million objects, and changes the "created_at" field for the first 2 
million entries.  First we create the container, add the 10 million entries to 
it, then we start the update cycle.  We see this slow performance on every 
update run, although if you run updates in succession without any break in 
between runs you do see an improvement in performance due to caching.


> So the first n-thousand records are Size1.
> You then start updating each of those...none of them are big
> enough...the new records get inserted at the first available empty slot
> (is that actually how this works or is there another row allocation
> strategy?).
[O'Toole, Eamonn] The SQL transactions are done through python.  This is the 
relevant section of code that deals with the object table, the container_stat 
table is updated by the triggers:

for rec in item_list:
query = '''
DELETE FROM object
WHERE name = ? AND (created_at < ?)
'''
if self.get_db_version(conn) >= 1:
query += ' AND deleted IN (0, 1)'
conn.execute(query, (rec['name'], rec['created_at']))
query = 'SELECT 1 FROM object WHERE name = ?'
if self.get_db_version(conn) >= 1:
query += ' AND deleted IN (0, 1)'
if not conn.execute(query, (rec['name'],)).fetchall():
conn.execute('''
INSERT INTO object (name, created_at, size,
content_type, etag, deleted)
VALUES (?, ?, ?, ?, ?, ?)
''', ([rec['name'], rec['created_at'], rec['size'],
  rec['content_type'], rec['etag'], rec['deleted']]))


> Now you go to update those records again...they can't fit in the 1st
> block...and some percentage of the data won't fit into the 2nd block
> (depends on the variability in size).  So, let's say half the records
> get relocated...eventually you reach homeostasis.
> 
> Also...disk fragmentation could be affecting you too but I wouldn't
> expect an order of magnitude difference on that.
> 
[O'Toole, Eamonn] Disk fragmentation is definitely a factor, but as you say 
defragging doesn't get you an order of magnitude improvement.
> Are you on Windows or Unix?
[O'Toole, Eamonn] Linux (Ubuntu)
> 
> Also...are your records indexed for the updates?
[O'Toole, Eamonn] The only object table index that I can see is the 
ix_deleted_name index, and that isn't used by the table update logic. 
> 
> 
> 
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Advanced GEOINT Solutions Operating Unit
> Northrop Grumman Information Systems
> 


Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread O'Toole, Eamonn
Sqlite_analyzer output:

/** Disk-Space Utilization Report For container10M.db
*** As of 2012-Nov-09 13:04:53

Page size in bytes 1024  
Pages in the whole file (measured) 3412536   
Pages in the whole file (calculated).. 3412535   
Pages that store data. 3412497 99.999% 
Pages on the freelist (per header) 38   0.001% 
Pages on the freelist (calculated) 39   0.001% 
Pages of auto-vacuum overhead. 00.0% 
Number of tables in the database.. 6 
Number of indices. 3 
Number of named indices... 1 
Automatically generated indices... 2 
Size of the file in bytes. 3494436864
Bytes of user payload stored.. 1768889095  50.6% 

*** Page counts for all tables with their indices 

OBJECT 3412485 99.999% 
SQLITE_MASTER. 60.0% 
INCOMING_SYNC. 20.0% 
OUTGOING_SYNC. 20.0% 
CONTAINER_STAT 10.0% 
SQLITE_SEQUENCE... 10.0% 

*** All tables and indices ***

Percentage of total database..  99.999%  
Number of entries. 2017  
Bytes of storage consumed. 3494396928
Bytes of payload.. 2941393332  84.2% 
Average payload per entry. 147.07
Average unused bytes per entry 19.51 
Average fanout 90.00 
Fragmentation.  75.9%
Maximum payload per entry. 929   
Entries that use overflow. 00.0% 
Index pages used.. 22459 
Primary pages used 3390038   
Overflow pages used... 0 
Total pages used.. 3412497   
Unused bytes on index pages... 2745050 11.9% 
Unused bytes on primary pages. 387423351   11.2% 
Unused bytes on overflow pages 0 
Unused bytes on all pages. 390168401   11.2% 

*** All tables ***

Percentage of total database..  59.3%
Number of entries. 1017  
Bytes of storage consumed. 2071290880
Bytes of payload.. 1768893042  85.4% 
Average payload per entry. 176.89
Average unused bytes per entry 18.62 
Average fanout 90.00 
Fragmentation.  59.3%
Maximum payload per entry. 929   
Entries that use overflow. 00.0% 
Index pages used.. 22459 
Primary pages used 2000286   
Overflow pages used... 0 
Total pages used.. 2022745   
Unused bytes on index pages... 2745050 11.9% 
Unused bytes on primary pages. 1834946059.0% 
Unused bytes on overflow pages 0 
Unused bytes on all pages. 1862396559.0% 

*** All indices **

Percentage of total database..  40.7%
Number of entries. 1000  
Bytes of storage consumed. 1423106048
Bytes of payload.. 1172500290  82.4% 
Average payload per entry. 117.25
Average unused bytes per entry 20.39 
Fragmentation.  99.910%  
Maximum payload per entry. 119   
Entries that use overflow. 00.0% 
Primary pages used 1389752   
Overflow pages used... 0 
Total pages used.. 1389752   
Unused bytes on primary pages. 203928746   14.3% 
Unused bytes on overflow pages 0 
Unused bytes on all pages. 203928746   14.3% 

*** Table CONTAINER_STAT *

Percentage of total database..   0.0%
Number of entries. 1 
Bytes of storage consumed. 1024  
Bytes of payload.. 185 18.1% 
Average payload per entry. 185.00
Average unused bytes per entry 826.00
Maximum payload per entry. 185   
Entries that use overflow. 00.0% 
Primary pages used 1 
Overflow pages used... 0 
Total pages used.. 1 
Unused bytes on primary pages. 826 80.7% 
Unused bytes on overflow pages 0 
Unused bytes on all pages. 826 80.7% 

*** Table INCOMING_SYNC and all its indices **

Percentage

Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread Black, Michael (IS)
>[O'Toole, Eamonn] This definitely sounds like it could be an issue.  There is 
>just one container_stat entry >confirmed by sqlite3_analyzer output which I'll 
>post later).  So you're saying that the single container_stat table :entry is 
>potentially being relocated very frequently the closer the update is to the 
>beginning of the db?

Yes...you didn't say what your data flow is...but since it's account-based I 
assume you have a bunch of accounts that get preloaded.
So the first n-thousand records are Size1.
You then start updating each of those...none of them are big enough...the new 
records get inserted at the first available empty slot (is that actually how 
this works or is there another row allocation strategy?).
Now you go to update those records again...they can't fit in the 1st 
block...and some percentage of the data won't fit into the 2nd block (depends 
on the variability in size).  So, let's say half the records get 
relocated...eventually you reach homeostasis.

Also...disk fragmentation could be affecting you too but I wouldn't expect an 
order of magnitude difference on that.

Are you on Windows or Unix?

Also...are your records indexed for the updates?



Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread O'Toole, Eamonn
Thanks Clemens, comments below:

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Clemens Ladisch
> Sent: 09 November 2012 12:26
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Anomalously slow performance on updates to early
> entries in a DB
> 
> O'Toole, Eamonn wrote:
> > We are seeing a performance anomaly on updates to existing object
> > records in the SQLite DB.  If the container DB is sufficiently large
> > (about 10 million objects, 3.3GB) then the time to update records at
> > the beginning of the database by order of entry is anomalously high.
> 
> Does VACUUM change anything?
[O'Toole, Eamonn] It improves the times, but there is still a very large 
performance penalty on updates to the beginning of the db.
> 
> > The time is particularly bad for the first approx. 100K records, is
> > somewhat better for the next 900K records, and settles down to
> > a consistent average from approx 1 million records on.  If this
> > consistent average time is around 7 seconds for 10,000 updates, then
> > we see times of about 170 seconds for 10,000 updates on the first
> 100K
> > records.  We don't see this anomalously high update time if we start
> > the updates after the first 1 million records.
> >
> > Note that table updates are performed by first DELETEing the entry
> and
> > then INSERTing the changed entry.
> 
> Is it the DELETE or the INSERT that is slow, or both?
[O'Toole, Eamonn] Good question - I don't know
> 
> Are you addressing the records by ROWID or by name?
[O'Toole, Eamonn] By name
> 
> > Note that Chunky_Ks (IRC) has suggested that B-tree rebalancing could
> > be the fundamental issue.
> 
> How big are the records, compared to the page size?
> (And what is the output of sqlite3_analyzer?)
[O'Toole, Eamonn] I'll post the sqlite_analyzer output in a separate mail.
> 
> What kind of updates are these?  If you're changing many values in the
> 'deleted' column in the same way, it might be possible that the
> ix_object_deleted_name index needs to be rebalanced.
[O'Toole, Eamonn] In our tests we change just the "created_at" field in the db 
entry.  However, more commonly the "size" and "etag" (md5sum) fields would also 
change.
> 
> > CREATE TRIGGER object_update BEFORE UPDATE ON object
> > BEGIN
> > SELECT RAISE(FAIL, 'UPDATE not allowed; DELETE and
> INSERT');
> > END;
> 
> Couldn't you implement this by running the code of both DELETE and
> INSERT triggers for the old and new records?  (This should avoid the
> need to rebalance the object table.)
> 
> 
[O'Toole, Eamonn] Good point.  On the face of it, I think that you're correct.  
I'll try this and see.

> Regards,
> Clemens


Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread O'Toole, Eamonn
Thanks Michael.  Comments below:

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Black, Michael (IS)
> Sent: 09 November 2012 13:49
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Anomalously slow performance on updates to early
> entries in a DB
> 
> Hmmm...is this a disk head seeking problem?
> You've got several TEXT entries which are either NULL or default to '';
> I did a small experiment: on 3.7.13
> CREATE TABLE container_stat (
> account TEXT,
> container TEXT,
> created_at TEXT,
> put_timestamp TEXT DEFAULT '0',
> delete_timestamp TEXT DEFAULT '0',
> object_count INTEGER,
> bytes_used INTEGER,
> reported_put_timestamp TEXT DEFAULT '0',
> reported_delete_timestamp TEXT DEFAULT '0',
> reported_object_count INTEGER DEFAULT 0,
> reported_bytes_used INTEGER DEFAULT 0,
> hash TEXT default '',
> id TEXT,
> status TEXT DEFAULT '',
> status_changed_at TEXT DEFAULT '0',
> metadata TEXT DEFAULT '',
> x_container_sync_point1 INTEGER DEFAULT -1,
> x_container_sync_point2 INTEGER DEFAULT -1);
> insert into container_stat(id,status,status_changed_at)
> values('id1','status1','');
> insert into container_stat(id,status,status_changed_at)
> values('id2','status2','');
> insert into container_stat(id,status,status_changed_at)
> values('id3','status3','');
> insert into container_stat(id,status,status_changed_at)
> values('id4','status4','');
> insert into container_stat(id,status,status_changed_at)
> values('id5','status5','');
> delete from container_stat where account='id1';
> insert into container_stat(account,status,status_changed_at)
> values('id1','status1 change1 to something
> else','status_changhed_at_1');
> 
> If you look at the order of data BEFORE the delete/insert occurs you
> get this (using "strings");
> id5status5
> id4status4
> id3status3
> id2status2
> id1status1
> 
> After a delete/insert where one of the fields grows in size you get
> this:
> id1status1 change1 to something
> elsestatus_changed_at_1
> id5status5
> id4status4
> id3status3
> id2status2
> id1status1
> 
> So...the probability of an update needing to move due to larger data
> increases the closer you are to the beginning of the database.  Each
> update would reduce the likelihood of that record getting relocated
> again as the fields grow in size.
>
[O'Toole, Eamonn] This definitely sounds like it could be an issue.  There is 
just one container_stat entry (confirmed by sqlite3_analyzer output which I'll 
post later).  So you're saying that the single container_stat table entry is 
potentially being relocated very frequently the closer the update is to the 
beginning of the db?


 
> If you use default values that are 2 sigma of the sizes of your strings
> (or maybe just 2X the average length or so) you would reduce the head
> seek time on updating records.
> 
> It basically sounds possibly like your disk head is slewing from the
> beginning of the file to the end on many of your updates.
[O'Toole, Eamonn] What we do see is that the disk utilization is at 100% during 
updates to the beginning of the file, but the actual throughput in terms of 
MB/s is low - which tallies with your hypothesis.

> 
> Also...would "insert or update" help you a bit?
> http://www.sqlite.org/lang_conflict.html
[O'Toole, Eamonn] I'll have a look at this, thanks.
> 
> 
> 
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Advanced GEOINT Solutions Operating Unit
> Northrop Grumman Information Systems
> 
> 
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org]
> on behalf of O'Toole, Eamonn [eamonn.oto...@hp.com]
> Sent: Friday, November 09, 2012 5:07 AM
> To: sqlite-users@sqlite.org
> Subject: EXT :[sqlite] Anomalously slow performance on updates to early
> entries in a DB
> 
> Hello all,
> 
> First of all, I'm a complete novice with respect to SQLite so apologies
> if there is an obvious answer to my question.  I've also posted this
> question in IRC, and it was suggested that I post the question to this
> mailing list.  We're running a test-bed of an object store (Openstack
> Swift) which uses SQLite to record information on the objects that are
> stored in a container.  The t

Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread Black, Michael (IS)
Hmmm...is this a disk head seeking problem?
You've got several TEXT entries which are either NULL or default to '';
I did a small experiment: on 3.7.13
CREATE TABLE container_stat (
account TEXT,
container TEXT,
created_at TEXT,
put_timestamp TEXT DEFAULT '0',
delete_timestamp TEXT DEFAULT '0',
object_count INTEGER,
bytes_used INTEGER,
reported_put_timestamp TEXT DEFAULT '0',
reported_delete_timestamp TEXT DEFAULT '0',
reported_object_count INTEGER DEFAULT 0,
reported_bytes_used INTEGER DEFAULT 0,
hash TEXT default '',
id TEXT,
status TEXT DEFAULT '',
status_changed_at TEXT DEFAULT '0',
metadata TEXT DEFAULT '',
x_container_sync_point1 INTEGER DEFAULT -1,
x_container_sync_point2 INTEGER DEFAULT -1);
insert into container_stat(id,status,status_changed_at) 
values('id1','status1','');
insert into container_stat(id,status,status_changed_at) 
values('id2','status2','');
insert into container_stat(id,status,status_changed_at) 
values('id3','status3','');
insert into container_stat(id,status,status_changed_at) 
values('id4','status4','');
insert into container_stat(id,status,status_changed_at) 
values('id5','status5','');
delete from container_stat where account='id1';
insert into container_stat(account,status,status_changed_at) 
values('id1','status1 change1 to something else','status_changhed_at_1');

If you look at the order of data BEFORE the delete/insert occurs you get this 
(using "strings");
id5status5
id4status4
id3status3
id2status2
id1status1

After a delete/insert where one of the fields grows in size you get this:
id1status1 change1 to something 
elsestatus_changed_at_1
id5status5
id4status4
id3status3
id2status2
id1status1

So...the probability of an update needing to move due to larger data increases 
the closer you are to the beginning of the database.  Each update would reduce 
the likelihood of that record getting relocated again as the fields grow in 
size.

If you use default values that are 2 sigma of the sizes of your strings (or 
maybe just 2X the average length or so) you would reduce the head seek time on 
updating records.

It basically sounds possibly like your disk head is slewing from the beginning 
of the file to the end on many of your updates. 

Also...would "insert or update" help you a bit?
http://www.sqlite.org/lang_conflict.html



Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Advanced GEOINT Solutions Operating Unit
Northrop Grumman Information Systems


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of O'Toole, Eamonn [eamonn.oto...@hp.com]
Sent: Friday, November 09, 2012 5:07 AM
To: sqlite-users@sqlite.org
Subject: EXT :[sqlite] Anomalously slow performance on updates to early entries 
in a DB

Hello all,

First of all, I'm a complete novice with respect to SQLite so apologies if 
there is an obvious answer to my question.  I've also posted this question in 
IRC, and it was suggested that I post the question to this mailing list.  We're 
running a test-bed of an object store (Openstack Swift) which uses SQLite to 
record information on the objects that are stored in a container.  The table 
itself (called "object") is very simple, it contains the modification time of 
the object, the size of object, the md5sum of the object, and the content-type 
of the object.  We are seeing a performance anomaly on updates to existing 
object records in the SQLite DB.  If the container DB is sufficiently large 
(about 10 million objects, 3.3GB) then the time to update records at the 
beginning of the database by order of entry is anomalously high.  The time is 
particularly bad for the first approx. 100K records, is somewhat better for the 
next 900K records, and settles down to a consistent average from
  approx 1 million records on.  If this consistent average time is around 7 
seconds for 10,000 updates, then we see times of about 170 seconds for 10,000 
updates on the first 100K records.  We don't see this anomalously high update 
time if we start the updates after the first 1 million records.

Note that table updates are performed by first DELETEing the entry and then 
INSERTing the changed entry.

Does anybody have any idea why we're seeing this behaviour, and what we can do 
to fix it?

Re: [sqlite] Write to a View

2012-11-09 Thread Clemens Ladisch
Marco Bambini wrote:
> From the official documentation:
> You cannot DELETE, INSERT, or UPDATE a view. Views are read-only in SQLite.
> However, in many cases you can use an INSTEAD OF trigger on the view to 
> accomplish the same thing.
>
> Can someone clarify this point?
> It seems like it is possible to write into a view, but I do not understand
> what is the correct way to perform the operation.

The trigger must change the view's base table(s) appropriately:

sqlite> CREATE TABLE pets(name, owner);
sqlite> CREATE VIEW my_pets AS SELECT name FROM pets WHERE owner='me';
sqlite> INSERT INTO my_pets VALUES('Fluffy');
Error: cannot modify my_pets because it is a view
sqlite> CREATE TRIGGER my_pets_insert
INSTEAD OF INSERT ON my_pets
BEGIN
  INSERT INTO pets(name, owner) VALUES(NEW.name, 'me');
END;
sqlite> INSERT INTO my_pets VALUES('Fluffy');
sqlite> SELECT * FROM my_pets;
Fluffy


Regards,
Clemens


Re: [sqlite] about date question

2012-11-09 Thread Igor Tandetnik
YAN HONG YE  wrote:
> sqlite> select 
> fmn,DATE('NOW'),supplierDate,date('now'),strftime(supplierDate),
> strftime(supplierDate)-date('now') from t93c_adl where
> strftime(supplierDate) e('now') limit 10;
> 854954|2012-11-09|12/01/2012|2012-11-09|12/01/2012|-2000
> 
> why use supplierDate,date('now'),strftime(supplierDate) the result is not 
> same result format?
> the one is 2012-11-09
> another is 12/01/2012

How are *we* supposed to know why you chose to put data in the wrong format 
into *your* database?

> and how to get the subtraction of two column?

I'll tell you - *after* you change supplierDate column to a format that 
date/time functions recognize. I showed you how.
-- 
Igor Tandetnik



[sqlite] Write to a View

2012-11-09 Thread Marco Bambini
From the official documentation:
You cannot DELETE, INSERT, or UPDATE a view. Views are read-only in SQLite.
However, in many cases you can use an INSTEAD OF trigger on the view to 
accomplish the same thing.

Can someone clarify this point?
It seems like it is possible to write into a view, but I do not understand what
is the correct way to perform the operation.

Thanks.
--
Marco Bambini
http://www.sqlabs.com


Re: [sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread Clemens Ladisch
O'Toole, Eamonn wrote:
> We are seeing a performance anomaly on updates to existing object
> records in the SQLite DB.  If the container DB is sufficiently large
> (about 10 million objects, 3.3GB) then the time to update records at
> the beginning of the database by order of entry is anomalously high.

Does VACUUM change anything?

> The time is particularly bad for the first approx. 100K records, is
> somewhat better for the next 900K records, and settles down to
> a consistent average from approx 1 million records on.  If this
> consistent average time is around 7 seconds for 10,000 updates, then
> we see times of about 170 seconds for 10,000 updates on the first 100K
> records.  We don't see this anomalously high update time if we start
> the updates after the first 1 million records.
>
> Note that table updates are performed by first DELETEing the entry and
> then INSERTing the changed entry.

Is it the DELETE or the INSERT that is slow, or both?

Are you addressing the records by ROWID or by name?

> Note that Chunky_Ks (IRC) has suggested that B-tree rebalancing could
> be the fundamental issue.

How big are the records, compared to the page size?
(And what is the output of sqlite3_analyzer?)

What kind of updates are these?  If you're changing many values in the
'deleted' column in the same way, it might be possible that the
ix_object_deleted_name index needs to be rebalanced.

> CREATE TRIGGER object_update BEFORE UPDATE ON object
> BEGIN
> SELECT RAISE(FAIL, 'UPDATE not allowed; DELETE and INSERT');
> END;

Couldn't you implement this by running the code of both DELETE and
INSERT triggers for the old and new records?  (This should avoid the
need to rebalance the object table.)


Regards,
Clemens


[sqlite] Anomalously slow performance on updates to early entries in a DB

2012-11-09 Thread O'Toole, Eamonn
Hello all,

First of all, I'm a complete novice with respect to SQLite so apologies if 
there is an obvious answer to my question.  I've also posted this question in 
IRC, and it was suggested that I post the question to this mailing list.  We're 
running a test-bed of an object store (Openstack Swift) which uses SQLite to 
record information on the objects that are stored in a container.  The table 
itself (called "object") is very simple, it contains the modification time of 
the object, the size of object, the md5sum of the object, and the content-type 
of the object.  We are seeing a performance anomaly on updates to existing 
object records in the SQLite DB.  If the container DB is sufficiently large 
(about 10 million objects, 3.3GB) then the time to update records at the 
beginning of the database by order of entry is anomalously high.  The time is 
particularly bad for the first approx. 100K records, is somewhat better for the 
next 900K records, and settles down to a consistent average from
  approx 1 million records on.  If this consistent average time is around 7 
seconds for 10,000 updates, then we see times of about 170 seconds for 10,000 
updates on the first 100K records.  We don't see this anomalously high update 
time if we start the updates after the first 1 million records.

Note that table updates are performed by first DELETEing the entry and then 
INSERTing the changed entry.

Does anybody have any idea why we're seeing this behaviour, and what we can do 
to fix it?

Note that Chunky_Ks (IRC) has suggested that B-tree rebalancing could be the 
fundamental issue.

Thanks.


SQL Schema follows:

1 database: main

Tables:
container_stat
incoming_sync
object
outgoing_sync

container_stat:
CREATE TABLE container_stat (
account TEXT,
container TEXT,
created_at TEXT,
put_timestamp TEXT DEFAULT '0',
delete_timestamp TEXT DEFAULT '0',
object_count INTEGER,
bytes_used INTEGER,
reported_put_timestamp TEXT DEFAULT '0',
reported_delete_timestamp TEXT DEFAULT '0',
reported_object_count INTEGER DEFAULT 0,
reported_bytes_used INTEGER DEFAULT 0,
hash TEXT default '',
id TEXT,
status TEXT DEFAULT '',
status_changed_at TEXT DEFAULT '0',
metadata TEXT DEFAULT '',
x_container_sync_point1 INTEGER DEFAULT -1,
x_container_sync_point2 INTEGER DEFAULT -1
);

incoming_sync:
CREATE TABLE incoming_sync (
remote_id TEXT UNIQUE,
sync_point INTEGER,
updated_at TEXT DEFAULT 0
);
CREATE TRIGGER incoming_sync_insert AFTER INSERT ON incoming_sync
BEGIN
UPDATE incoming_sync
SET updated_at = STRFTIME('%s', 'NOW')
WHERE ROWID = new.ROWID;
END;
CREATE TRIGGER incoming_sync_update AFTER UPDATE ON incoming_sync
BEGIN
UPDATE incoming_sync
SET updated_at = STRFTIME('%s', 'NOW')
WHERE ROWID = new.ROWID;
END;

object:
CREATE TABLE object (
ROWID INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
created_at TEXT,
size INTEGER,
content_type TEXT,
etag TEXT,
deleted INTEGER DEFAULT 0
);
CREATE INDEX ix_object_deleted_name ON object (deleted, name);
CREATE TRIGGER object_delete AFTER DELETE ON object
BEGIN
UPDATE container_stat
SET object_count = object_count - (1 - old.deleted),
bytes_used = bytes_used - old.size,
hash = chexor(hash, old.name, old.created_at);
END;
CREATE TRIGGER object_insert AFTER INSERT ON object
BEGIN
UPDATE container_stat
SET object_count = object_count + (1 - new.deleted),
bytes_used = bytes_used + new.size,
hash = chexor(hash, new.name, new.created_at);
END;
CREATE TRIGGER object_update BEFORE UPDATE ON object
BEGIN
SELECT RAISE(FAIL, 'UPDATE not allowed; DELETE and INSERT');
END;

outgoing_sync:
CREATE TABLE outgoing_sync (
   remote_id TEXT UNIQUE,
sync_point INTEGER,
updated_at TEXT DEFAULT 0
);
CREATE TRIGGER outgoing_sync_insert AFTER INSERT ON outgoing_sync
BEGIN
UPDATE outgoing_sync
SET updated_at = STRFTIME('%s', 'NOW')
WHERE ROWID = new.ROWID;
END;
CREATE TRIGGER outgoing_sync_update AFTER UPDATE ON outgoing_sync
BEGIN
UPDATE outgoing_sync
SET updated_at = STRFTIME('%s', 'NOW')
WHERE ROWID = new.ROWID;
END;

Re: [sqlite] about date question

2012-11-09 Thread Drake Wilson
Quoth YAN HONG YE , on 2012-11-09 08:35:25 +:
>   
>
> //--why have noting in the result?

This is not a realtime chat system.  If you won't wait even ten
minutes before squawking about the same thing again, a mailing list
may not be for you.  Are you understanding any of the responses at
all?  Give a sign of it if so!

And "y" and "Y" are not the same thing.

   ---> Drake Wilson


Re: [sqlite] about date question

2012-11-09 Thread Drake Wilson
Quoth YAN HONG YE , on 2012-11-09 08:28:53 +:
> The result let crazy.

Because they're STRINGS!  There is NO DATE TYPE in SQLite!  Not to
mention if those values are intended to be ISO 8601, they should have
zero padding; it should be "2012-09-07" and such---in which case they
will compare correctly because that's the way that date format was
designed, but it won't happen just because.

The string "29" comes before the string "3" because "2" comes before
"3".  It doesn't automatically figure out what a string is "supposed"
to be and compare accordingly.

   ---> Drake Wilson


[sqlite] about date question

2012-11-09 Thread YAN HONG YE
sqlite> select strftime(importdate) from t93c_adl3 limit 5;
2012-9-13
2012-9-5
2012-10-9
2012-10-9
2012-10-9
sqlite> select strftime('%y-%m-%d',importdate) from t93c_adl3 limit 5;

 
//--why have noting in the result?




sqlite>


[sqlite] about date question

2012-11-09 Thread YAN HONG YE
The result let crazy.

sqlite> select distinct importdate from t93c_adl3 order by importdate desc limit
 12;
2012-9-7
2012-9-6
2012-9-5
2012-9-4
2012-9-3
2012-9-29 //---here ,  why?
2012-9-28
2012-9-27
2012-9-26
2012-9-25
2012-9-24
2012-9-20
sqlite>


Re: [sqlite] about date question

2012-11-09 Thread Drake Wilson
Quoth YAN HONG YE , on 2012-11-09 05:45:06 +:
> why use supplierDate,date('now'),strftime(supplierDate) the result is not 
> same result format?
> the one is 2012-11-09
> another is 12/01/2012
> ?
> and how to get the subtraction of two column?

SQLite calendar functions handles textual dates in ISO 8601 format,
but you're not asking strftime to extract a date from supplierDate and
format that; you're asking it to use supplierDate as a _format string_
and since it does not contain any % characters it is returned as-is.
The first argument to strftime is a format string, and the actual date
information is supplied as separate arguments.

You seem to be trying to treat SQLite as though it actually has a date
data type, but it doesn't.  There are different _representations_ of
dates and times and durations and so forth, and only some of them will
work in some situations.  Now _please_ try to understand this _before_
throwing another dozen SELECT statements at the list.  Don't just ask
people to give you the magic formula or ask why it doesn't work when
you plug random things together; try to learn the underlying
principles properly.

   ---> Drake Wilson