On 2014-05-22, Tim Chase wrote:

> On 2014-05-22 12:47, Adam Funk wrote:
>> I'm using Python 3.3 and the sqlite3 module in the standard library.
>> I'm processing a lot of strings from input files (among other
>> things, values of headers in e-mail & news messages) and suppressing
>> duplicates using a table of seen strings in the database.
>>
>> It seems to me --- from past experience with other things, where
>> testing integers for equality is faster than testing strings, as
>> well as from reading the SQLite3 documentation about INTEGER
>> PRIMARY KEY --- that the SELECT tests should be faster if I am
>> looking up an INTEGER PRIMARY KEY value rather than TEXT PRIMARY
>> KEY.  Is that right?
>
> If sqlite can handle the absurd length of a Python long, you *can* do
> it as ints:

It can't.  An SQLite3 INTEGER is at most an 8-byte signed value.

https://www.sqlite.org/datatype3.html

But after reading the other replies to my question, I've concluded
that what I was trying to do is pointless.

> >>> from hashlib import sha1
> >>> s = "Hello world"
> >>> h = sha1(s)
> >>> h.hexdigest()
> '7b502c3a1f48c8609ae212cdfb639dee39673f5e'
> >>> int(h.hexdigest(), 16)
> 703993777145756967576188115661016000849227759454L

That ties in with a related question I've been wondering about lately
(using MD5s & SHAs for other things).  Getting a hash value (which is
internally numeric, rather than a string, right?) out as a hex string
& then converting that to an int looks inefficient to me.  Is there
any better way to get an int?  (I haven't seen any other way in the
API.)

-- 
A firm rule must be imposed upon our nation before it destroys
itself. The United States needs some theology and geometry, some taste
and decency. I suspect that we are teetering on the edge of the abyss.
                                             --- Ignatius J Reilly
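
For reference, a minimal sketch of one way to skip the hex round trip, assuming Python 3.2 or later (where int.from_bytes exists) and bytes input. The helper names and the two range constants are made up for illustration. The second helper keeps only the first 8 bytes of the digest so the result fits an SQLite INTEGER; that is an assumption about how one might squeeze it in, and it throws away most of the hash, so collisions become more likely than with the full digest.

```python
from hashlib import sha1

# Range of an 8-byte signed integer, the widest SQLite INTEGER.
SQLITE_INT_MIN = -2**63
SQLITE_INT_MAX = 2**63 - 1


def digest_as_int(data: bytes) -> int:
    """Full 160-bit SHA-1 digest as an int, built from the raw bytes."""
    return int.from_bytes(sha1(data).digest(), "big")


def digest_as_sqlite_int(data: bytes) -> int:
    """First 8 digest bytes as a signed 64-bit int (hypothetical helper)."""
    return int.from_bytes(sha1(data).digest()[:8], "big", signed=True)


full = digest_as_int(b"Hello world")
short = digest_as_sqlite_int(b"Hello world")

# Same value as the hex route, without building the hex string.
print(full == int(sha1(b"Hello world").hexdigest(), 16))
# The full digest is far too large for an SQLite INTEGER; the short one fits.
print(full > SQLITE_INT_MAX, SQLITE_INT_MIN <= short <= SQLITE_INT_MAX)
```

Whether this is measurably faster than int(h.hexdigest(), 16) is not something the sketch shows; it only avoids the intermediate string.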