Hi David,
Davide Alberani wrote:
On Apr 03, Mark Armendariz <[EMAIL PROTECTED]> wrote:
I've just joined this list. I found this app this morning and
downloaded the imdb database last night. While converting the data, I
was getting an error stating that some data was too long for the
person_info.info field in the biographies list.
Whoah! :-)
I'll be unable to download an updated version of the plain text
data files for another week or two, so I can't check this right now.
fortunately, you'll only need the person info file to test. =)
This patch changes the person_info.info field from a text to a
mediumtext field.
I'm a bit reluctant to change the type of a field: I have to
consider how BLOBCol works with other (non-MySQL) databases and
its behavior regarding non-ascii chars (usually stored as utf8 -
but it's not a requirement - and retrieved/internally managed by
IMDbPY as unicode strings).
I was trying to figure out the best way to make a minimal change to the
code. All I really wanted to do was use a MEDIUMTEXT instead of a plain
TEXT field so it could fit the data properly. I would have been fine
finding a way to suppress the error - but thought it might be better to
resolve it instead.
I tried the sqlobject documentation, but didn't find what I needed. So,
I cracked open the sqlobject code and found that the "UnicodeCol" type
can only make TEXT and VARCHAR fields. I dug further to find the
'proper' way to make a MEDIUMTEXT.
In SOBlobCol (which is the meat of BlobCol), I found this:
----- Snip -----
def _mysqlType(self):
length = self.length
varchar = self.varchar
if length >= 2**24:
return varchar and "LONGTEXT" or "LONGBLOB"
if length >= 2**16:
return varchar and "MEDIUMTEXT" or "MEDIUMBLOB"
if length >= 2**8:
return varchar and "TEXT" or "BLOB"
return varchar and "TINYTEXT" or "TINYBLOB"
----- /Snip -----
Which essentially meant to me that if I set it to varchar=True it would
use a TEXT field, and if I set the length anywhere between 2**16 and
2**24 - 1, it would give me a MEDIUMTEXT field, which, fortunately, is
not a BLOB column - seems to me that this should probably be in
UnicodeCol instead, but I digress.
After making this change, I was able to successfully convert the text
files into a mysql database.
That's a bit strange - can you help me on this?
What's your db engine (InnoDB, MyISAM, etc)?
The 'person_info.info' column created by an _unpatched_ version of
IMDbPY, what kind of field (TEXT, VARCHAR, ...) creates?
Can you insert a text longer than 2**16-1 bytes in a TEXT field,
on your system? On mine it's possible - and the value is silently
truncated.
Can you scan the database you've populated with your patched IMDbPY?
I'm interested in the longest strings:
SELECT LENGTH(info), info, id FROM person_info ORDER BY LENGTH(info) DESC
LIMIT 0, 10
I'm using Mysql 5.0.45-community with InnoDB tables. Here's the
response from your query:
LENGTH(info) info id
77754 Susan Atkins, a.k.a. Sadie Mae Glutz, is one of th... 68871
55448 One of seven children, Frank Capra was born on May... 259006
40908 Maurice Tourneur, the film director and screenwrit... 1675285
36290 Anna May Wong, the first Chinese-American movie st... 1814292
35348 Christopher Dusendschön is a Filmmaker and a Holly... 472793
34707 Maurice Rapf, the Hollywood screenwriter who becam... 1370052
31799 John Hay "Jock" Whitney, the multi-millionaire spo... 1785987
30605 Emilio "El Indio" Fernández Romo, who was born on ... 529248
30539 Marlon Brando is widely considered the greatest mo... 195182
29863 What do the classic and near-classic films _Dawn P... 714084
It seems to be only that first column that stretched beyond the limit of
a TEXT field.
+ info = BLOBCol(notNone=True, varchar=True, length=2**16)
I'm not a database expert, but I think a TEXT field can contains
2**16-1 bytes; here you're creating a MEDIUMTEXT (or MEDIUMBLOB?)
just a byte longer... how can it be enough?
Well, you can't set a length on a TEXT field (including TINYTEXT,
MEDIUMTEXT, LONGTEXT, etc). The length limit is inherent to mysql.
MEDIUMTEXT will allow 2**24 - 1
Thank you very much for your support!
No, Thank You!!
Mark
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel