On 05/13/2011 05:03 PM, Alan Gauld wrote:
> As I say, just some thoughts,


I *am* curious, Alan, whether you or anyone else on the list is able to help me make this a little more efficient:

                cur.execute("SELECT short_url,long_url FROM short_urls")
                giant_list = cur.fetchall()

                for i in giant_list:
                        if i[0]:
                                if i[1]:
                                        mc.set(i[0], i[1])


At present we have about two million short URLs in our database, and I'm guessing there's a much smoother way of iterating through 2M+ rows from the database and cramming them into memcache. I imagine a map call, or something that avoids building the whole list in memory at once, could be much more efficient?
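One idea I had was to fetch in chunks and push each chunk to memcache in a single call, something like the rough sketch below. It assumes the same DB-API cursor (cur) and memcached client (mc) as above, and that the client has set_multi (python-memcached and pylibmc both do); the chunk size of 10,000 is just a guess, and I haven't measured whether this actually helps:

    # Rough sketch of the batching idea: fetch rows in chunks rather than
    # one giant fetchall(), and send each chunk to memcache in one call.
    cur.execute("SELECT short_url, long_url FROM short_urls")

    while True:
        rows = cur.fetchmany(10000)    # pull 10k rows at a time, not all 2M
        if not rows:
            break
        # keep only rows where both columns are non-empty
        chunk = dict((short, long_) for short, long_ in rows if short and long_)
        if chunk:
            mc.set_multi(chunk)        # one round trip per chunk instead of per key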

v2 of our project will be to join our short_urls table with its 'stats' table counterpart, so that I only fetch the top 10,000 URLs (or some other smaller quantity). Until we get to that point, I need to speed up the restart time if this script ever needs to be restarted. That is partly why v1.5 was to put the database entries into memcache, so we wouldn't need to reload the db into memory on every restart.
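Just to make that v2 plan concrete, the query I'm picturing is roughly the one below. The stats table's column names are guesses on my part (a hit_count column keyed by short_url), so the details will certainly change:

    # v2 sketch: only warm the cache with the most-used URLs.
    # Table/column names on the stats side are guesses.
    cur.execute("""
        SELECT s.short_url, s.long_url
        FROM short_urls AS s
        JOIN stats AS t ON t.short_url = s.short_url
        ORDER BY t.hit_count DESC
        LIMIT 10000
    """)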

Thanks,
Ian
