I'm sure a lot of you noticed the new IMDb feature: character pages.
You can see a complete description in this thread (requires a free
registration):
  http://akas.imdb.com/board/bd0000040/reply/86514303

And an example of these pages here:
  http://www.imdb.com/character/ch0000001/

As you can see, it's a really cool feature, and I'm wondering whether
it should be supported by IMDbPY (and, for that matter, whether it's
possible at all).

Looking at the overall design, this means a new Character class,
almost identical to the Person class.
Instances of the Person class will change: person.currentRole
will no longer be a unicode string, but a Character instance.
Similarly, instances of Character will have a character.currentPlayer
(the name of the attribute is still to be defined) which will be
a Person instance.
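
To make the relationship concrete, here is a minimal sketch in Python.
The two classes are stand-ins written only for illustration, not the
real IMDbPY classes; the link() helper is hypothetical, and the
currentPlayer attribute name is still undecided, as said above.

```python
class Person:
    """Stand-in for a person; not the real IMDbPY Person class."""

    def __init__(self, name, currentRole=None):
        self.name = name
        # Proposed change: a Character instance instead of a plain string.
        self.currentRole = currentRole


class Character:
    """Stand-in for the proposed Character class (hypothetical)."""

    def __init__(self, name, currentPlayer=None):
        self.name = name
        # Attribute name still to be defined; it would hold a Person.
        self.currentPlayer = currentPlayer


def link(person, character):
    """Hypothetical helper: tie a person and a character for one credit."""
    person.currentRole = character
    character.currentPlayer = person


# Made-up sample data.
actor = Person("Some Actor")
role = Character("Some Character")
link(actor, role)
```

After link(), actor.currentRole is the Character object and
role.currentPlayer is the Person object, so each side can reach the
other without any string lookup.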

So far - more or less - so good: a bit of refactoring but nothing
impossible.

It's easy to support the data for the 'http', 'httpThin' and 'mobile'
data access systems; the real problems come with 'local' and 'sql'.
I assume there are no plans to include this information in the
plain text data files, so our options are limited.

We could ignore character support completely: Person instances
will still have a Character object in person.currentRole, but there
will be no imdb_access.get_character() and imdb_access.search_character()
methods, and imdb_access.update(character) will do nothing or raise
an exception.
Honestly I _really_ dislike such a "solution": IMDbPY has always been,
as long as the information was there, _independent_ of the source of
the data, and this would be a major departure from that spirit.

[local]
Taking a quick look at the 'local' support: the actors.list and
actresses.list need to be parsed to create some files:
- characters.index: map a "characterID" (equivalent of a personID)
    to an offset in the characters.data file.
- characters.data containing fields in a format like this:
    CHARACTER_ID: the ID of this character.
    BIT: a bit, 0 for actors.list and 1 for actresses.list.
    MOVIE_ID: a movie in which the character appeared.
    PERSON_ID: useful to easily retrieve the person name playing this
               character in this movie.
    TXT: not sure if it's needed, to store notes and so on.
- characters.key: map character name to CHARACTER_IDs, needed by
    the imdb_access.search_character() method.
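
As a rough sketch of how these three files could fit together, the
Python below builds them in memory. Everything not listed above is an
assumption: the fixed-size binary record layout, the omission of the
TXT field, one characterID per distinct name, and the function names
build_character_files() and read_character(). The sample credits are
made up.

```python
import io
import struct

# Assumed record layout (not specified in the proposal): little-endian
# CHARACTER_ID (4 bytes), BIT (1 byte), MOVIE_ID (4 bytes),
# PERSON_ID (4 bytes).  The optional TXT field is left out.
RECORD = struct.Struct("<IBII")


def build_character_files(credits):
    """Build in-memory versions of characters.key, .index and .data.

    credits: iterable of (character_name, bit, movie_id, person_id),
    where bit is 0 for actors.list and 1 for actresses.list.
    """
    key = {}    # character name -> characterID (characters.key)
    by_id = {}  # characterID -> list of (bit, movie_id, person_id)
    for name, bit, movie_id, person_id in credits:
        char_id = key.setdefault(name, len(key) + 1)
        by_id.setdefault(char_id, []).append((bit, movie_id, person_id))

    data = io.BytesIO()
    index = {}  # characterID -> offset of its first record
    for char_id in sorted(by_id):
        index[char_id] = data.tell()
        for bit, movie_id, person_id in by_id[char_id]:
            data.write(RECORD.pack(char_id, bit, movie_id, person_id))
    return key, index, data.getvalue()


def read_character(char_id, index, data):
    """Return every (bit, movie_id, person_id) record of one character."""
    records = []
    offset = index[char_id]
    while offset < len(data):
        cid, bit, movie_id, person_id = RECORD.unpack_from(data, offset)
        if cid != char_id:
            break  # records of the next character start here
        records.append((bit, movie_id, person_id))
        offset += RECORD.size
    return records


# Made-up sample credits.
credits = [
    ("James Bond", 0, 10, 100),
    ("Miss Moneypenny", 1, 10, 200),
    ("James Bond", 0, 11, 101),
]
key, index, data = build_character_files(credits)
```

A search by name goes through key, then index, then a sequential read
of the contiguous records in data; whether a real implementation
should store a record count instead of scanning is an open choice.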

No support for "alternate names" or "biographies" of characters,
since this information is not in the database.
_Every_ character _ever_ played will have its own characterID and
an entry in each of the above files; I'm not sure how big they will be.
And the support to get the "real" characterID from the IMDb site
will be limited: many characters will not have their characterID
on the web pages, while others may be listed under an alternate name.

[sql]
Similar to 'local': we need to parse actors.list and actresses.list
at insert time, and add a "character" table.
The personRole column of the castInfo table can become a foreign
key pointing to the character, but it may also remain a string
(retrieving the characterID at access-time).
Another table, character_info, will be required to map a character
to the movies in which it was present and the personID of the
actor/actress.
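
A minimal sketch of the two new tables, using Python's sqlite3 module
only because it needs no server. The table names come from the text
above; the column names (id, name, character_id, movie_id, person_id)
are assumptions, and the castInfo table is left out. The rows are
made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "character" (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Maps a character to the movies it appeared in and to the person
-- who played it there (column names are assumptions).
CREATE TABLE character_info (
    character_id INTEGER NOT NULL REFERENCES "character"(id),
    movie_id     INTEGER NOT NULL,
    person_id    INTEGER NOT NULL
);
""")

conn.execute(
    'INSERT INTO "character" (id, name) VALUES (?, ?)',
    (1, "Example Character"),
)
conn.executemany(
    "INSERT INTO character_info VALUES (?, ?, ?)",
    [(1, 10, 100), (1, 11, 101)],
)

# All (movie, person) pairs for one character, looked up by name.
rows = conn.execute(
    """
    SELECT ci.movie_id, ci.person_id
    FROM character_info AS ci
    JOIN "character" AS c ON c.id = ci.character_id
    WHERE c.name = ?
    ORDER BY ci.movie_id
    """,
    ("Example Character",),
).fetchall()
```

If personRole stays a string in castInfo, the same lookup by name
would be the way to retrieve the characterID at access time.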


I've surely missed something; it looks like a big task, and
I'm not sure it's worth the effort.
What do you think about it?  Any volunteers? :-)

I'll start now, writing a script to count how many distinct
characters appear in the actors.list and actresses.list files -
this is probably an important piece of information (and I'll know
whether this info can be extracted easily/quickly enough).
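
Such a counting script could look roughly like the sketch below. It
assumes that each credit line carries the role in square brackets
after the title and takes the first bracketed text on the line; the
function name and the sample lines are made up for illustration.

```python
import re

# Assumption: a credit line carries the role in square brackets,
# e.g. "Name<TAB>Title (1999)  [Role]  <1>".
ROLE_RE = re.compile(r"\[([^\]]+)\]")


def count_distinct_characters(lines):
    """Count distinct role names found in an iterable of credit lines."""
    roles = set()
    for line in lines:
        match = ROLE_RE.search(line)
        if match:
            roles.add(match.group(1).strip())
    return len(roles)


# Made-up sample lines; the last one has no role at all.
sample = [
    "Doe, John\tSome Movie (1999)  [Hero]  <1>",
    "\t\t\tAnother Movie (2001)  [Hero]  <2>",
    "Roe, Jane\tSome Movie (1999)  [Villain]  <3>",
    "Roe, Jane\tThird Movie (2003)",
]
total = count_distinct_characters(sample)
```

On the real files the function can be fed an open file object
directly. Note that it counts distinct names, so two different
characters sharing a name count once, and the file headers and
footers would need to be skipped.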


-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
