On Tue, 2017-04-11 at 07:39 -0400, Richard Hipp wrote:
> On 4/11/17, Reid Thompson <reid.thomp...@ateb.com> wrote:
> > On Mon, 2017-04-10 at 21:39 +0100, Simon Slavin wrote:
> > > On 10 Apr 2017, at 9:28pm, Reid Thompson <reid.thomp...@ateb.com> wrote:
> > > 
> > > > my questions are, if I prepare and utilize a statement for a result set
> > > > in the tens of thousands (or more) using a where clause along the lines
> > > > of
> > > >    "ORDER BY batch_id, due_datetime, random()"
> > > > 
> > > > 1) As I sqlite3_step through the result set, am I guaranteed to get
> > > > each
> > > > row only once?
> > > 
> > > Yes.  _step() will return each row exactly once.  This assumes you will
> > > not make any changes to the table those rows are in until you have
> > > finished stepping.  If you make any changes to the table
> > > before the last call to _step() then things may get more complicated.
> > 
> > 
> > I have a manager process that manages the above step'ing. It gathers
> > records in batches of 30, marks them as claimed, and forwards those
> > batches to one of 50 worker processes. Each worker process performs work
> > based on each records data and when complete provides that information
> > back to the manager along with the record data.  The manager process
> > uses the returned information to update the records.  When a worker
> > completes a batche, the manager process sends them another batch. So
> > in most cases, I'm performing two updates to the record while step'ing
> > through the result set.
> 
> If you make changes to the table being queried in the middle of the
> query, then there are no guarantees.
> 
> In your case, you are *probably* ok (for now) because the ORDER BY is
> probably not being accomplished using an index.  But if batch_id and
> due_datetime are both NOT NULL and unique, then the third ORDER BY
> term will be ignored, and the UNIQUE index will be used to order the
> output.  And in that case, if any of your worker threads modify the
> batch_id or due_datetime fields, then there could be issues.
> 
> A safe work-around is to run the query into a TEMP table:
> 
>    CREATE TEMP TABLE work_to_do AS SELECT .....
> 
> Then query the work_to_do table for stuff to be modified.

Thank you.  I'll make adjustments to my process.  One follow on
question.  This would be a concern regardless of whether random() is
used or not in the ORDER BY clause? 

reid
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to