Kevin O'Gorman wrote:
> I wrote a super simple program to the read the file and count how many
> records are already there.  I got impatient waiting for it so I killed
> the process and added an output of one dot (".") per 1000 records.  It
> went very fast for what I estimate was around 200 dots and hit a wall.
> It made progress, but very very slowly.  [...]
> The field being used for the lookup has an index.

I'd guess that most records are found, and that the file and the table
happen to be somewhat sorted.  The search becomes slow when the amount
of data that needs to be read exceeds the available memory.

> Why does commit() make a difference?

Hmmm ... interesting.

>             for row in conn.execute("""
>                         SELECT pnum
>                         FROM pos
>                         WHERE ppos=?
>                     """,(pos,)):
>                 pnum = row[0]
>                 break
>             if pnum is None:
>                 missing += 1
>             else:
>                 present += 1

Even with the index on ppos, the DB still has to look up the table row
to read the pnum value.

You do not care about the actual pnum value, so you could replace it
with a constant value ("SELECT 1 FROM ...").  Or just use EXISTS to
show what you actually want to do:

        cursor = conn.execute("""
                SELECT EXISTS (
                        SELECT *
                        FROM pos
                        WHERE ppos = ?)
                """, (pos,))
        exists = cursor.fetchone()[0]
        if exists:
            present += 1
            missing += 1

sqlite-users mailing list

Reply via email to