That's a fair statement.
Indices not only increase the size of the data; they also slow down
database writes, because the index tables need to be written as well as
the data tables.
Still, you want to avoid a full table scan on a table with millions of
records.
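To make the size trade-off concrete, here is a minimal sketch (using Python's built-in sqlite3 rather than the MySQL setup discussed in the thread; the table and index names are made up for illustration) showing that adding an index measurably grows the database file:

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk database so we can measure its page count.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
db = sqlite3.connect(path)
db.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO item (name) VALUES (?)",
               [("row%d" % i,) for i in range(10000)])
db.commit()

# Size of the database (in pages) before indexing.
pages_before = db.execute("PRAGMA page_count").fetchone()[0]

db.execute("CREATE INDEX idx_item_name ON item (name)")
db.commit()

# Size after indexing: the index occupies additional pages.
pages_after = db.execute("PRAGMA page_count").fetchone()[0]
print(pages_before, pages_after)
```

The same extra structure has to be updated on every INSERT or UPDATE, which is where the write slowdown comes from.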
SQLite also has an EXPLAIN command, just like MySQL does.
If processing one row at a time works, then go for it. If you want to try
more at a time, use Massimo's approach and just make the number smaller than
1000.
Please beware of indexing.
It is a double-edged sword.
Indexing too many columns will increase the data size.
One can first run the query with an EXPLAIN EXTENDED clause (in a MySQL
database).
That can tell you which column(s) would need an index.
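The same check can be done from Python against SQLite with EXPLAIN QUERY PLAN (SQLite's rough equivalent of MySQL's EXPLAIN EXTENDED). A small sketch, with an invented table and index, showing the difference between an index lookup and a full scan:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE INDEX idx_item_name ON item (name)")

# Equality on the indexed column: the plan mentions the index.
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM item WHERE name = ?", ("x",)
).fetchall()
indexed = " ".join(row[-1] for row in plan)

# An expression the index cannot serve: SQLite falls back to a scan.
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM item WHERE length(name) = 3"
).fetchall()
scanned = " ".join(row[-1] for row in plan)

print(indexed)   # mentions idx_item_name
print(scanned)   # a SCAN of the table
```

If the plan for your slow query shows a scan on a large table, that column is a candidate for an index.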
On Jul 23, 6:29 pm, Cliff wrote:
You have exposed two relatively advanced programming topics: code
profiling and database performance tuning.
Because I am a relative newbie to both Python and SQLite, I
unfortunately cannot give you specific directions.
But I can offer an approach you might try. Maybe you should first
learn where the time is being spent.
Ah!
The database tier did not flash into my mind.
I have never used SQLite.
I am using MySQL; the table structure is partially normalised and
suitably indexed.
Because of these optimizations, whenever I query a huge table (>1
million rows), I get very fast response.
I don't know whether any she
Thanks, Vineet! Lots of good info there!
I don't have actual code yet, because I couldn't even get the db queries to
work in a reasonable amount of time.
The little code I showed in my initial post already runs slowly (my DB
records are db.itemEntry, not db.table ...).
The slowness (so far)
Thanks, Massimo!
Even if I grab 1000 at a time (which is definitely better to do!!),
I still have to wait for minutes before I get rows back!
However, I have found that this is 'fast':

    n = db.table.count()  # not sure if that's the syntax, but somehow
                          # find out how many records are in the table
    for i in range(n):
        row = db(db.table).select(limitby=(i, 1)).first()
        # do something with row
You never want to do this:

    n = db(q).count()
    s = db(q)
    for i in range(1, n):
        r = s.select(limitby=(i, 1)).first()
        # do something with r

Instead, consider something like this:

    i, m = 0, 1000
    while True:
        rows = db(q).select(limitby=(i*m, (i+1)*m))
        if not rows:
            break
        for r in rows:
            # do something with r
            pass
        i += 1
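The same batching idea, sketched in plain sqlite3 rather than the DAL (table and column names are invented; the point is fetching m rows per round trip instead of one):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO item (name) VALUES (?)",
               [("row%d" % i,) for i in range(2500)])

# LIMIT/OFFSET batching, mirroring the limitby loop above.
m, offset, seen = 1000, 0, 0
while True:
    rows = db.execute(
        "SELECT id, name FROM item ORDER BY id LIMIT ? OFFSET ?",
        (m, offset),
    ).fetchall()
    if not rows:
        break
    for r in rows:
        seen += 1  # do something with r
    offset += m

# With a DB-API cursor you can also keep one query open and pull
# batches with fetchmany(), avoiding a fresh query per chunk.
cur = db.execute("SELECT id, name FROM item")
seen2 = 0
while True:
    batch = cur.fetchmany(m)
    if not batch:
        break
    seen2 += len(batch)
```

Either way the per-row query overhead disappears; only the batch size is tuned.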
Luis,
The problem of churning huge data can be tackled very well in Python.
The task can be very slow if we use the wrong methods; at the same
time, it can be amazingly fast using the correct mix of methods in
Python (e.g. row-by-row 'for' loops cause a large overhead).
I have gone through the same situation.
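As one illustration of that "correct mix of methods" point, here is a hedged sketch (sqlite3, with made-up tables a and b) contrasting a row-at-a-time loop with a single batched call:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE a (n INTEGER)")
db.execute("CREATE TABLE b (n INTEGER)")
data = [(i,) for i in range(5000)]

# Row at a time: one Python-level execute() call per row.
for row in data:
    db.execute("INSERT INTO a (n) VALUES (?)", row)

# Batched: one executemany() call pushes all rows, and is usually
# much faster for large data sets.
db.executemany("INSERT INTO b (n) VALUES (?)", data)
db.commit()

count_a = db.execute("SELECT COUNT(*) FROM a").fetchone()[0]
count_b = db.execute("SELECT COUNT(*) FROM b").fetchone()[0]
print(count_a, count_b)  # 5000 5000
```

Both end up with identical data; only the number of Python-level calls differs, which is often where the time goes.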