Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-24 Thread Simon Slavin

On 25 Jun 2013, at 1:41am, jhnlmn  wrote:

> Using trigger during insert slows down insert by about 20%.
> This is not much, but unnecessary as long as AUTOINCREMENT
> does not reach the maximum value.
> As other people mentioned, this is unlikely to happen
> in our lifetime. But if it will, then this trigger/second table approach
> may be a solution.

It won’t.

2^64 is about 2e19.  Let’s suppose you could find a storage system that would 
run 24/7 and not break down for a long long time.  Let’s suppose you can insert 
a million rows a second.  It would take about half a million years of 
continuous operation to overflow the AUTOINC number.

I bet you a slap up dinner at the best hotel in the world if you still care 
about this issue in even a hundred thousand years time.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-24 Thread jhnlmn
Keith Medcalf  writes:
> ... you can create a trigger that adds the rowid of
> rows inserted (ie, that you need to update) into another table.  

Using trigger during insert slows down insert by about 20%.
This is not much, but unnecessary as long as AUTOINCREMENT
does not reach the maximum value.
As other people mentioned, this is unlikely to happen
in our lifetime. But if it will, then this trigger/second table approach
may be a solution.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-24 Thread jhnlmn
Peter Aronson  writes:
> create trigger inter_update before update on my_table
> begin
>    select raise(ignore) where my_function() = 1;
> end;

No, this is the worst approach so far.
This "raise(ignore)" does abort that single update 
of that particular row, but the loop continues.
So, if I called
UPDATE T set C1 = calculation(C2) where C1 is NULL
on 100,000 rows and my_function will begin returning 1 after 10,000 calls,
then my_function will be called 100,000 times and calculation will be called
100,000 times, but only 10,000 rows will be actually updated.
And the time of such semi-aborted update is almost the same 
as time of a single update without any triggers 
(almost 5 seconds, which is close to the timeout time).
But then I will have to call update again, this time it will make 90,000
calls, etc, etc, etc.
The total time is 6 times worse than time of running single update query
without any limits. 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-23 Thread RSmith
Bit long ago to recall exactly, but to be honest I think it was quite more sinister even, if memory serves, I was modifying the 
rowids too. Either way I avoid messing with rowids or relying on them altogether now as a short-cut to updating - but it s quite 
safe to do so.


As an aside, while trying to remember my predicament and reading the link Donald sent, it prompted me to simply compute what a 
logger would have to do to exhaust the possible rowid count as it stands now as an assessment of whether it is safe to assume there 
will be enough rowids to last a good amount of time.


Basically worked out that if you wrote (assuming your hardware was capable) a thousand records every one-thousandth of a second (or 
simply a million records a second), you would have 86,400,000,000 (~86 Billion) records per day and 365.224 days per year, that's 
31,555,353,600,000 (~31 Trillion) records per year.
Now dividing the max_rowid by it (9223372036854775807 / 3155535360) yields ~292,291 years.  That's almost three-hundred-thousand 
years.
An entire human lifespan is less than a 100th of a percent of that amount of time, or put in a less mathematical way, it is several 
times longer than the entire recorded human history up to now.


I think it's safe to assume that your hardware will die before the rowids do. :)
Cheers!


On 2013/06/23 14:33, Donald Griggs wrote:

Regarding:  ...then went horribly wrong at some point after rowids consumed all 
of 32 bits...

If your application treated ROWIDs as 32-bit integers, that may possibly have been the cause of your problem, since  ROWID's are 
64-bit integers.


The maximum ROWID is 9223372036854775807 http://www.sqlite.org/autoinc.html
and would seem hard to exhaust with your application in our lifetimes.






___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-23 Thread Donald Griggs
Regarding:   ...then went horribly wrong at some point after rowids
consumed all of 32 bits...

If your application treated ROWIDs as 32-bit integers, that may possibly
have been the cause of your problem, since  ROWID's are 64-bit integers.

The maximum ROWID is  9223372036854775807
http://www.sqlite.org/autoinc.html
and would seem hard to exhaust with your application in our lifetimes.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-23 Thread RSmith


On 2013/06/23 00:15, Simon Slavin wrote:

//...
UPDATE myTable SET C1 = [calculation] WHERE rowid > [lastupdatedrowid]

then update the updateprogress table.  Since there is no searching needed, 
there’s no need for scanning the table, and no need for any indexes.

Simon.


One caution on this approach - I once made a logger which logged to SQLite, several devices logging once a second, so quite a 
well-paced consumption of rowids.
Another app/thread came in and archived anything older than 6 months by reading, compressing, deleting (to keep querying to 
acceptable speeds).

Another thread still validated/updated logged readings - this one used this scheme 
of 'WHERE rowid > '+lastCheckedRowID;

It all worked perfectly but then went horribly wrong at some point after rowids consumed all of 32 bits (I think), the next rowids 
were not even starting at the beginning again (where stuff were deleted first), it seemed almost random at times... Brought a whole 
system to its knees. (No catastrophic failures, just locking up in forever-running queries etc). I just changed how the system 
worked at the time, did not have time to research the problem more, so no great advice on what to do if it happens.


Not sure if this is still the behaviour with newest version of SQLite - maybe 
someone can confirm.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-22 Thread Larry Brasfield
Keith Medcalf wrote:
> > I considered that as well, but it's not clear how much benefit you get
> > over the autoincrement scheme: the PK-index is there either way, so
> > that's not a slowdown.  The split table approach also makes
> > query-writing and indexing more complex, so at a minimum you'd probably
> > want to make a view that runs a UNION ALL on the two tables.
>
> Whatever for?  One table contains the data, the other contains the rowids
from the first table that require updating.  You would not be able to UNION
(ALL or otherwise) the rows from the two tables -- they are entirely
different.

He might want to write a query for that view which selects the rows in the
first table which require no updating and unions that with another select
which gets the ones in need of updating and also substitutes the computed
values which would be made durable by the update.

Cheers,
-- 
Larry Brasfield
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-22 Thread Keith Medcalf

> I considered that as well, but it's not clear how much benefit you get
> over the autoincrement scheme: the PK-index is there either way, so
> that's not a slowdown.  The split table approach also makes
> query-writing and indexing more complex, so at a minimum you'd probably
> want to make a view that runs a UNION ALL on the two tables.

Whatever for?  One table contains the data, the other contains the rowids from 
the first table that require updating.  You would not be able to UNION (ALL or 
otherwise) the rows from the two tables -- they are entirely different.

---
()  ascii ribbon campaign against html e-mail
/\  www.asciiribbon.org



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-22 Thread Simon Slavin

On 22 Jun 2013, at 10:50pm, Ryan Johnson  wrote:

> I considered that as well, but it's not clear how much benefit you get over 
> the autoincrement scheme: the PK-index is there either way, so that's not a 
> slowdown.  The split table approach also makes query-writing and indexing 
> more complex, so at a minimum you'd probably want to make a view that runs a 
> UNION ALL on the two tables.

If this really is a case of filling in missing information only in the 'new' 
rows, then there’s no need to search for lines with a NULL value in at all.  
Just keep track of which rows you’ve updated so far:

CREATE TABLE updateprogress (tableName TEXT, lastUpdatedRowId INTEGER)

When you have the time to update a table, do

UPDATE myTable SET C1 = [calculation] WHERE rowid > [lastupdatedrowid]

then update the updateprogress table.  Since there is no searching needed, 
there’s no need for scanning the table, and no need for any indexes.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-22 Thread Ryan Johnson

On 21/06/2013 8:41 PM, Keith Medcalf wrote:

No, in my case user does not touch the DB - he can only add or delete
items,
but all DB modifications are done by my app and I have a full control over
all the values. All I need is to find the most efficient way to do that.

Therefore, I do not believe that I need any triggers. I guess they will
simply slow down insertion and update (however I did not measure this
yet).
In those rare cases when there is a need to reset attributes on an already
existing item, I would rather delete the entire row and insert a new one
and
treat it as a completely new item.

Perhaps you should reconsider using triggers.  For example, you can create a 
trigger that adds the rowid of rows inserted (ie, that you need to update) into 
another table.  Periodically from your application process the rows indicated 
in this table to do your updates, and delete the rowids for the rows you have 
updated.  This will add almost no discernible overhead to your table updates, 
plus it will give you a table with the rowids of the rows you need to visit and 
update.  If you wish to do them in small batches then you can do so.
I considered that as well, but it's not clear how much benefit you get 
over the autoincrement scheme: the PK-index is there either way, so 
that's not a slowdown.  The split table approach also makes 
query-writing and indexing more complex, so at a minimum you'd probably 
want to make a view that runs a UNION ALL on the two tables.


Ryan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-21 Thread Keith Medcalf
> No, in my case user does not touch the DB - he can only add or delete
> items,
> but all DB modifications are done by my app and I have a full control over
> all the values. All I need is to find the most efficient way to do that.
> 
> Therefore, I do not believe that I need any triggers. I guess they will
> simply slow down insertion and update (however I did not measure this
> yet).
> In those rare cases when there is a need to reset attributes on an already
> existing item, I would rather delete the entire row and insert a new one
> and
> treat it as a completely new item.

Perhaps you should reconsider using triggers.  For example, you can create a 
trigger that adds the rowid of rows inserted (ie, that you need to update) into 
another table.  Periodically from your application process the rows indicated 
in this table to do your updates, and delete the rowids for the rows you have 
updated.  This will add almost no discernible overhead to your table updates, 
plus it will give you a table with the rowids of the rows you need to visit and 
update.  If you wish to do them in small batches then you can do so.


---
()  ascii ribbon campaign against html e-mail
/\  www.asciiribbon.org



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-21 Thread Peter Aronson


From: jhnlmn <jhn...@yahoo.com>
To: sqlite-users@sqlite.org 
Sent: Friday, June 21, 2013 3:20 PM
Subject: Re: [sqlite] How to interrupt a long running update without roll back?

Alas, none of the above answered my original question "How to interrupt a
long running update without roll back". But, guess, I will not get an answer.

I don't know if it'll work for your application (there being a cost), but a 
before trigger that performs an raise(ignore) seems to interrupt an update 
without a rollback.  So a trigger of the form:

create trigger inter_update before update on my_table
begin
   select raise(ignore) where my_function() = 1;
end;

Seems like it should interrupt an update when you take some action to make 
my_function() return 1 without causing a rollback, according to my quick test.

Peter
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-21 Thread jhnlmn
Ryan Johnson  writes:
> So if I understand correctly, you start out with only C2, use C2 to 
> drive an expensive/slow computation that derives all other fields in the 
> row, and set C1 != NULL when the derivation completes? And you don't 
> want a trigger because that would pin the slow computation on the 
> inserting (application) thread rather than the batch-updating 
> (background) thread?

Yes.

> In that case, your AUTOINCREMENT approach is probably the best you can do...

There is also a dumb approach of copying entire DB to a new file,
inserting/deleting/updating rows in this file, then renaming this file back.
This requires an custom lock, otherwise is simple and fast.

Alas, none of the above answered my original question "How to interrupt a
long running update without roll back". But, guess, I will not get an answer.

Thank you for the discussion.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-21 Thread Ryan Johnson

On 21/06/2013 2:29 PM, jhnlmn wrote:

Ryan Johnson  writes:
Q1: Is C1 *always* NULL in a newly-inserted row, or does the application
sometimes insert some arbitrary value?
Q2: Does the transition from NULL to calculation(C2) mean something
special to the application?

...

Scenario 3: C1=NULL has a specific meaning in your application

C1=NULL is just a flag, whose only meaning is that update was not done yet
on the recently inserted row.

If in Windows Explorer you enter a folder with few thousand music files,
then Explorer will show the list of files in the left column of detailed
view instantaneously, but then it will begin slowly fill up other columns
(Title, Artist, Album).
My application is not an Explorer, but it has a similar pattern - show the
list of item names ASAP and then prepare and store item attributes on the
background. This is a pretty common pattern, I guess. So, in my minimal
example C2 is the item url and C1 is item attributes (in reality there will
be several columns with attributes: C11, C12, etc., but C1=NULL means that
attributes were not prepared yet and C1!=NULL means that attributes are ready).


Scenario 2: C1=calculation(C2) is the default to use when the user fails

to supply C1;

No, in my case user does not touch the DB - he can only add or delete items,
but all DB modifications are done by my app and I have a full control over
all the values. All I need is to find the most efficient way to do that.

Therefore, I do not believe that I need any triggers. I guess they will
simply slow down insertion and update (however I did not measure this yet).
In those rare cases when there is a need to reset attributes on an already
existing item, I would rather delete the entire row and insert a new one and
treat it as a completely new item.
So if I understand correctly, you start out with only C2, use C2 to 
drive an expensive/slow computation that derives all other fields in the 
row, and set C1 != NULL when the derivation completes? And you don't 
want a trigger because that would pin the slow computation on the 
inserting (application) thread rather than the batch-updating 
(background) thread?


In that case, your AUTOINCREMENT approach is probably the best you can do...

Ryan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-21 Thread jhnlmn
> Ryan Johnson  writes:
> Q1: Is C1 *always* NULL in a newly-inserted row, or does the application 
> sometimes insert some arbitrary value?
> Q2: Does the transition from NULL to calculation(C2) mean something 
> special to the application?
...
> Scenario 3: C1=NULL has a specific meaning in your application

C1=NULL is just a flag, whose only meaning is that update was not done yet
on the recently inserted row.

If in Windows Explorer you enter a folder with few thousand music files,
then Explorer will show the list of files in the left column of detailed
view instantaneously, but then it will begin slowly fill up other columns
(Title, Artist, Album). 
My application is not an Explorer, but it has a similar pattern - show the
list of item names ASAP and then prepare and store item attributes on the
background. This is a pretty common pattern, I guess. So, in my minimal
example C2 is the item url and C1 is item attributes (in reality there will
be several columns with attributes: C11, C12, etc., but C1=NULL means that
attributes were not prepared yet and C1!=NULL means that attributes are ready).

> Scenario 2: C1=calculation(C2) is the default to use when the user fails 
to supply C1;

No, in my case user does not touch the DB - he can only add or delete items,
but all DB modifications are done by my app and I have a full control over
all the values. All I need is to find the most efficient way to do that.

Therefore, I do not believe that I need any triggers. I guess they will
simply slow down insertion and update (however I did not measure this yet).
In those rare cases when there is a need to reset attributes on an already
existing item, I would rather delete the entire row and insert a new one and
treat it as a completely new item.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-21 Thread Ryan Johnson

On 21/06/2013 2:55 AM, jhnlmn wrote:

Ryan Johnson  writes:
histo = list(conn.execute('select min(rowid) lo, max(rowid) hi, count(*)
n from lineitem group by rowid/1' order by lo))

...

 a,b,n = buckets[-1]


Thank you for your reply.
You code is not very correct ("a" always remain 0),
but I got your idea (and tested it).
Bucket creation adds about 50% to the time of update, which is acceptable.

By the way, I also posted my question at
http://stackoverflow.com/questions/17099491/sqlite3-how-to-interrupt-a-long-running-update-without-roll-back

So, if you care, you may wish to post your "bucket" solution there as well,
in case somebody will need it.


Why not use a trigger on T to update C1 whenever it gets set to NULL?

My case is actually simpler: C1 is NULL only on newly inserted rows
(I should have said this from the beginning).

So, I just realized that if I enable AUTOINCREMENT feature,
then I do not need to process entire rowid range,
but only range of rows added after the last update.
This range should not have many holes and I can process this range
10,000 rowids at a time.
This will work until rowid will reach the largest possible integer.
This sounds like a good time to re-examine what end goal you're really 
after here (rather than how to implement this particular way of 
achieving that goal). Some questions worth asking yourself might include:


Q1: Is C1 *always* NULL in a newly-inserted row, or does the application 
sometimes insert some arbitrary value?


Q2: Does the transition from NULL to calculation(C2) mean something 
special to the application?


Scenario 1: C1=calculation(C2) is a constraint.
Solution: mediate everything through a view:

create table T_data(..., C2, ...); -- C1 is conspicuously absent here
create view T as select *, calculation(C2) C1 from T_data;
create trigger T_update instead of update on T begin update T_data 
set ..., C2=new.C2, ... where rowid = new.rowid;

end; -- need similar "instead of insert" as well

Scenario 2: C1=calculation(C2) is the default to use when the user fails 
to supply C1; the initial C1=NULL is a workaround for inability to 
express "DEFAULT calculation(C1)". Alternatively, C1 is a constraint, 
but calculation() is expensive enough we don't want to recompute it.

Solution: use an after-insert trigger:

create trigger tc1 after insert on T when new.c1 is NULL begin update T 
set c1=calculation(c2) where rowid=new.rowid; end;


Scenario 3: C1=NULL has a specific meaning in your application, and it 
matters when C1 transitions from NULL to calculation(C2).
Solution: your AUTOINCREMENT + batch update is probably the best choice. 
The timing of the batch update almost certainly should depend on 
something other than number of rows inserted so far; otherwise a trigger 
could run the batch whenever a large enough rowid is created, which 
would imply that a normal after update trigger would work. I strongly 
suspect this is either *not* your actual scenario, or you have a lurking 
bug: if NULL C1 really matters, and it's important for the the 
transition away from NULL not to happen right away, that suggests 
something must happen in the system "promote" "old-enough" NULL C1 to 
non-NULL status. However, that batch, whatever the trigger, could catch 
a just-barely-inserted NULL C1 and change it to C2 before any reader 
notices it was even there. If that is OK, then there's a very good 
chance that you actually have Scenario 2.


Thoughts?
Ryan


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-21 Thread jhnlmn
Simon Slavin  writes:
> Nevertheless, this is the way the problem should be solved according to
the design of SQLite.  What you are
> doing is searching for NULL entries in a table. 
> The way you speed up a search is to create an index

I do not mind adding indexes to columns, which are searched by the user.
But my C1 is for internal use and I do not really need to search it.

All I need is to loop through all rows, perform some comparison,
update row and commit it (I would prefer to commit immediately
after updating each row instead of after entire update query is finished).
So, in this particular case SQL is more of an obstacle. 
It would have been easier to have a low level API for looping
and updating rows, like some simpler DB engines used to have.
Or, as I wrote before, just commit on interrupt.

> Is the increase in input/time really unacceptable to you ? 
> The only thing that should take longer is inserting the row
> in the first place. 

Yes, this time is the most important.
This is why I keep NULL in C1 - to make insertion as quick as possible.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-21 Thread jhnlmn
> Ryan Johnson  writes:
> histo = list(conn.execute('select min(rowid) lo, max(rowid) hi, count(*) 
> n from lineitem group by rowid/1' order by lo))
...
> a,b,n = buckets[-1]


Thank you for your reply.
You code is not very correct ("a" always remain 0),
but I got your idea (and tested it).
Bucket creation adds about 50% to the time of update, which is acceptable.

By the way, I also posted my question at
http://stackoverflow.com/questions/17099491/sqlite3-how-to-interrupt-a-long-running-update-without-roll-back

So, if you care, you may wish to post your "bucket" solution there as well,
in case somebody will need it.

> Why not use a trigger on T to update C1 whenever it gets set to NULL?

My case is actually simpler: C1 is NULL only on newly inserted rows
(I should have said this from the beginning).

So, I just realized that if I enable AUTOINCREMENT feature, 
then I do not need to process entire rowid range, 
but only range of rows added after the last update.
This range should not have many holes and I can process this range
10,000 rowids at a time.
This will work until rowid will reach the largest possible integer.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-19 Thread Ryan Johnson

On 19/06/2013 1:41 AM, jhnlmn wrote:

Thank you for your response

Simon Slavin  writes:

UPDATE T SET C1 = calculation(C2) WHERE C1 IS NULL AND rowid < 1

This is the best solution when the table is freshly created
and max(rowid) == number of rows.
But after many deletes and inserts many rowid will be unused,
max(rowid) may grow indefinitely and the number of required updates will be
also indefinite.

So, no perfect solution so far.
Why not use a trigger on T to update C1 whenever it gets set to NULL? Is 
it actually important for the NULL to persist until your next batch 
update comes along?


If a trigger would work, that's by far the cleanest solution, and will 
have almost no impact on concurrency. Otherwise...


Assuming you want to process between 10k and 20k rows at a time, you 
might try something like this (writing in python, you should be able to 
translate it easily to the language you actually use) :


import sqlite3
c = sqlite3.open('my-database.db')
histo = list(conn.execute('select min(rowid) lo, max(rowid) hi, count(*) 
n from lineitem group by rowid/1' order by lo))

buckets,i = [(0,0,0),], 0
while i < len(histo):
a,b,n = buckets[-1]
c,d,m = histo[i]
buckets[-1] = (a,d,n+m)
i += 1
if n+m >= 1
buckets.append((0,0,0))

for lo,hi,n in buckets:
conn.execute('update T set C1=calculation(C2) where C1 is NULL and 
rowid between ? and ?', (lo, hi))


Translated into plain English: count the number of rows in each slice of 
10k, being sure to return slices in order. That query will be fast 
because it should read from your rowid index and has small output size 
(even a billion-row input will only produce 100k rows). Merge too-small 
slices so that each contains somewhere between 10k and 20k-1 rows, then 
run your update query, passing the lower and upper bound of each slice 
to limit where it looks. Again, the index on rowid will enforce the 
range limit without a table scan (but double-check the output of 
"explain query plan").


NOTE: by splitting the transaction, you risk the database changing out 
from under you before the last slice is done. Somebody *could* delete a 
whole chunk of the rowid space, for example, and throw off your 
carefully computed slice sizes. Or they could add rows after you make 
the slices, and those rows would be ignored.  Or they could set C1=NULL 
on rows you already looked at. Most likely, you can squint and claim 
that all those kinds of things just happened "after" the batch update, 
but whether that's allowed is application dependent.


Ryan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-19 Thread Simon Slavin

On 19 Jun 2013, at 6:41am, jhnlmn  wrote:

> Simon Slavin  writes:
> 
>> Do you have an index on T(C1) ?  
>> That should dramatically reduce the search time.
> 
> I tried adding index.
> It caused slow down of the original insert of records to the table by about
> 25%, which is unacceptable to me.
> I also slows down the update because it has to update index as well.
> Note that C1 is not the only column, on which I would like to make updates.
> So, I will have to add several indexes, which will be even worse.

Nevertheless, this is the way the problem should be solved according to the 
design of SQLite.  What you are doing is searching for NULL entries in a table. 
 The way you speed up a search is to create an index ideally suited to the 
search.  And as you can see, it works: time for your update command is reduced 
from 'seconds or even minutes' to 2 seconds.

Is the increase in input/time really unacceptable to you ?  Updates that don’t 
change the value of the fields in the index should not take any longer.  The 
index is updated only if the values in it change.  The only thing that should 
take longer is inserting the row in the first place.  Is a slowdown of even 
100% on inputting new data really unacceptable ?  It should prove, in the long 
run, less inconvenient than the long locked period you are currently trying to 
solve.

You can, of course, do your inputting without the extra index existing, then 
create the index later, at a time suitable for you.  But the job of creating 
the index will block other processes from accessing the table just like the 
UPDATE command you currently do does.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-19 Thread jhnlmn
Thank you for your response

Simon Slavin  writes:
> Another way would be to use the LIMIT clause on UPDATE

Yes, this was the very first thing I tried.
But without index on C1 the time of each query with limit
is not limited. For example on 1 mln rows the first
UPDATE T set C1 = calculation(C2) where C1 is NULL LIMIT 1
takes 1 second, but the last: 4.5 seconds
which is dangerously close to the timeout.

> Do you have an index on T(C1) ?  
> That should dramatically reduce the search time.

I tried adding index.
It caused slow down of the original insert of records to the table by about
25%, which is unacceptable to me.
I also slows down the update because it has to update index as well.
Note that C1 is not the only column, on which I would like to make updates.
So, I will have to add several indexes, which will be even worse.

But the index does, indeed, limits the time of one update query
UPDATE T set C1 = calculation(C2) where C1 is NULL LIMIT 1
to about 2 seconds.

> UPDATE T SET C1 = calculation(C2) WHERE C1 IS NULL AND rowid < 1

This is the best solution when the table is freshly created
and max(rowid) == number of rows.
But after many deletes and inserts many rowid will be unused,
max(rowid) may grow indefinitely and the number of required updates will be
also indefinite.

So, no perfect solution so far.

I wonder whether there is a theoretical possibility
of making commit in response to interrupt or special progress_handler return
value?
I am not asking to implement it in a standard sqlite,
but, may be, I can do it as my own custom patch?
Or is it completely impossible?

Thank you



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to interrupt a long running update without roll back?

2013-06-14 Thread Simon Slavin

On 14 Jun 2013, at 11:29pm, DL  wrote:

> UPDATE T set C1 = calculation(C2) where C1 is NULL
> If table is large, this update may take many seconds or even minutes.
> During this time all other queries on this table fail with database is 
> locked
> after connection timeout expires (currently my timeout is 5 seconds).

Do you have an index on T(C1) ?  That should dramatically reduce the search 
time.

> I would like to stop this update query after, say, 3 seconds,
> then restart it. Hopefully, after several restarts entire table will be 
> updated.
> Another option is to stop this update query before making any other request
> (this will require inter-process cooperation, but it may be doable).

Well, you could increase your timeout:




which will fix the problem one way.  Another way would be to use the LIMIT 
clause on UPDATE, if it’s available to you:



Do an UPDATE LIMIT 1, then check the total_changes() function



and if it’s not zero pause a while for your other processes to do their thing, 
then do another UPDATE LIMIT 1.

If compiling your own SQLite is a problem for you, step through your big table 
a section at a time:

UPDATE T SET C1 = calculation(C2) WHERE C1 IS NULL AND rowid < 1

then pause for a while, then do it again but this time up to row 2, etc..

Simon.


Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users