On Nov 20, Joachim Selke <[EMAIL PROTECTED]> wrote:

> > Fixed.  Now it's all to see if IMDbPY has decent performances
> > on a DB2 server... :-)
> 
> After running for about 8 hours,

Ohhh... a little too much. :-(
It should be _all over_ in 4 or 5 hours, even on an outdated/busy computer.

Maybe you can gain some (a lot of?) speed using transactions; you
can try the imdbpy2sql.py option "--sqlite-transactions": it was
designed for SQLite (and will issue a warning if used with another
server, but you can ignore it).
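
To give an idea of why transactions can help, here is a minimal sketch
using Python's standard sqlite3 module. It is not IMDbPY's code: the
table layout and the load_rows helper are made up for illustration. It
loads the same rows twice, once committing after every row and once
inside a single transaction.

```python
import sqlite3
import time

def load_rows(rows, one_transaction):
    """Insert rows into a fresh table; return (row count, seconds taken).

    Hypothetical helper: the "name" table is only a stand-in for a real one.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE name (id INTEGER PRIMARY KEY, name TEXT)")
    start = time.time()
    for row in rows:
        conn.execute("INSERT INTO name (id, name) VALUES (?, ?)", row)
        if not one_transaction:
            conn.commit()  # one commit per row
    conn.commit()  # a single commit closes the one big transaction
    elapsed = time.time() - start
    count = conn.execute("SELECT COUNT(*) FROM name").fetchone()[0]
    conn.close()
    return count, elapsed

rows = [(i, "Person %d" % i) for i in range(10000)]
for one_transaction in (False, True):
    count, elapsed = load_rows(rows, one_transaction)
    print("one_transaction=%s: %d rows in %.3fs"
          % (one_transaction, count, elapsed))
```

With an in-memory database the difference is small; on a disk-backed
or networked server every commit usually costs a sync or a round trip,
so the gap tends to be much larger. How much DB2 gains is something
only a real run can tell.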

> the script runs into an error:
  [...]
> ibm_db_dbi.DataError: ibm_db_dbi::DataError: Statement Execute Failed:
> [IBM][CLI Driver] CLI0109E  String data right truncation. SQLSTATE=22001
> SQLCODE=-99999

I have a very clear idea of what's going on.
Using SQLObject (and this excludes DB2, btw) you can create indexes
even on TEXT columns, and they will be handled one way or another (also
following instructions about how many chars have to be considered
when creating the index).
With SQLAlchemy this is not possible, and you'll end up with an
exception, because many db servers cowardly refuse to create indexes
on a whole (possibly of infinite length) TEXT column.

IMDbPY's solution?  Using SQLAlchemy, some TEXT columns (the ones
which require an index) are VARCHAR(255) - and so the indexes are
created and everybody is happy.
Sort of: there are very few cases (at this time, 3 - THREE - in the whole
database, and they are all persons' "names") in which the data is
too long.
The good news: every database server seems to truncate the value,
issue a warning and go on.
The bad news: that was true only until now. :-(

Here we are: DB2 dies a horrible death, instead of truncating
the data.
If you ask me, it's a bug in the ibm_db driver, but they may
have their reasons to do that.

Keeping in mind that I think it's an ibm_db bug, I'm open to
hints about how to handle the situation; I can think of:

- find a way to create indexes - using SQLAlchemy - on every server
  for TEXT columns.
- shrink every VARCHAR text to 253 chars (crazy: it's not needed for
  SQLObject, where tables are TEXT, and will lose data and affect
  performance).
- catch exceptions raised at insert-time, and issue a warning about
  it.  This can (and probably will) be done anyway to make the script
  more solid, but won't be a great solution: A LOT of data will be lost,
  in this case.
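
A minimal sketch of the third option (catch the exception at insert
time and warn), again using the standard sqlite3 module rather than
ibm_db or IMDbPY's real code. SQLite does not enforce VARCHAR lengths,
so a CHECK constraint stands in for DB2's strict behaviour; that
constraint, the table and the insert_names helper are all assumptions
made for this demo.

```python
import sqlite3
import warnings

def insert_names(conn, names):
    """Insert names one by one; warn about and skip the rejected ones."""
    skipped = []
    for name in names:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO name (name) VALUES (?)", (name,))
        except sqlite3.DatabaseError as e:
            # The whole row is dropped: this is the data loss in question.
            warnings.warn("skipping %r...: %s" % (name[:30], e))
            skipped.append(name)
    return skipped

conn = sqlite3.connect(":memory:")
# The CHECK constraint emulates a server that refuses over-long values
# instead of truncating them (demo assumption, not DB2's real mechanism).
conn.execute(
    "CREATE TABLE name (name VARCHAR(255) CHECK (length(name) <= 255))"
)
skipped = insert_names(conn, ["Short Name", "x" * 300])
print("inserted:", conn.execute("SELECT COUNT(*) FROM name").fetchone()[0])
print("skipped:", len(skipped))
```

Note that this only works row by row: if many rows are sent in a single
batch, one bad value may take the whole batch down with it, which is
where the big loss of data would come from.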

Other ideas?

PS for Joachim: I really appreciate your valuable help, but as you
can imagine, the problem is becoming more and more academic. :-)
In the sense: actually the solution _for you_ is to use another
database server (maybe you can migrate the data, later, to DB2).
Honestly, ibm_db seems too immature, and IMDbPY has a lot of
rough edges of its own (as you've seen).
Obviously I'll be more than glad to keep looking for work-arounds
for this kind of funny problem, but waiting for a solution to all these
bugs is not the best way to solve your situation. :-)


Thanks!
-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
