Hi Camille,
Your idea to import the data you need into a SQL database sounds good,
and IMDbPY could help.

There are, however, some caveats. First of all, the document you found
(README.sqldb.txt) refers to an obsolete set of data that IMDb stopped
updating some years ago.
A new, up-to-date dataset exists ( https://www.imdb.com/interfaces/ ),
and IMDbPY is able to import it into a SQL database of your choice; you
can find the documentation here:
https://imdbpy.readthedocs.io/en/latest/usage/s3.html

But you may face another problem: this new dataset includes very little
information.
Look at it and decide whether it's enough for your project.
If it is, you can proceed.
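
To look at the dataset before importing anything, a small sketch like the
following could help. It assumes you have already downloaded one of the
gzipped, tab-separated files (for example title.basics.tsv.gz) by hand;
peek_tsv is a made-up helper name for this example, not part of IMDbPY.

```python
import csv
import gzip
from itertools import islice


def peek_tsv(path, n=5):
    """Return the first n rows of a gzipped TSV file as a list of dicts.

    The first line of the file is assumed to be a header with column names.
    """
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE: treat quote characters inside titles as ordinary text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(islice(reader, n))


# Example use (assumes the file is in the current directory):
# for row in peek_tsv("title.basics.tsv.gz", 3):
#     print(row)
```

Printing a few rows from each file should be enough to see which fields are
available and whether they cover the criteria you want to filter on.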

More or less, the workflow would be as follows:

1. Install the latest version of IMDbPY from
   https://github.com/alberanid/imdbpy/ (see
   https://imdbpy.readthedocs.io/en/latest/#installation ).

2. Download the dataset. You can do it manually or, if you prefer, use
   the "download-from-s3" script you'll find in the docs/goodies
   directory (it requires a Linux system).

3. Import the dataset. As an example, to import the data into a SQLite
   database, you can do something like:

   s32imdbpy.py /path/to/the/imdb-dataset-2020-07-17/ sqlite:///imdb.db --verbose

   (Notice the three / in sqlite:///imdb.db: they are all needed.)
   After a while, you will have an "imdb.db" file in the current
   directory, containing the imported data.

4. You can now search and analyze the data in this file using
   Python's "sqlite3" module.
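
As a rough sketch of that last step, you could first ask SQLite which
tables and columns the import created, rather than assuming their names.
The helpers describe_database and sample_rows are made up for this example
(they are not part of IMDbPY), and "imdb.db" is the file name used in the
import command above.

```python
import sqlite3


def _quote_identifier(name):
    # Table names cannot be passed as bound parameters, so quote them by hand.
    return '"' + name.replace('"', '""') + '"'


def describe_database(db_path):
    """Return {table_name: [column names]} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is its name.
            info = conn.execute(f"PRAGMA table_info({_quote_identifier(table)})")
            schema[table] = [column[1] for column in info]
        return schema
    finally:
        conn.close()


def sample_rows(db_path, table, limit=5):
    """Return up to `limit` rows from one table, to see what the data looks like."""
    conn = sqlite3.connect(db_path)
    try:
        query = f"SELECT * FROM {_quote_identifier(table)} LIMIT ?"
        return conn.execute(query, (limit,)).fetchall()
    finally:
        conn.close()


# Example use (assumes imdb.db exists in the current directory):
# for table, columns in describe_database("imdb.db").items():
#     print(table, columns)
```

Once you know the real table and column names, the filters you need
(title type, genre, year and so on) become ordinary SELECT ... WHERE
queries run through the same module.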


Let me know if you have questions or something is not clear.

Hope this helps.

On Fri, Jul 17, 2020 at 3:33 PM Camille Sanchez
<camille.sanche...@gmail.com> wrote:
>
> Hi team,
>
> For a python class, I am trying to sort the IMDb films through different 
> criteria (such as title types, genre, keywords, plot, etc. - similar to this 
> page of the IMDb website) and get a list of movies that match these criteria. 
> However, I am not quite sure how to start, could you help? I have downloaded 
> IMDbPy but I am not sure what to do next.
>
> I saw in different posts, such as "Retrieving all movies and csv file" and
> "Retrieving a List of Movies in a Given Year", that it is possible to retrieve
> the data without having a specific movie ID or movie title.
>
> Logically, I would assume that it is easy to query the data from a database. 
> I have done it on smaller projects with SQL database for instance, but I am 
> not sure I understand what is the best approach to do such a task. Is it:
> - to store all the movies into a server like AWS and then do the searches 
> directly from it
> - to access directly the database using this README.sqldb.txt file
> Or is there another way?
>
> Are there steps anywhere I can follow? Sorry I am a beginner at all this.
>
> Thank you for your help!
>
> Best.
> Camille
>
>
> _______________________________________________
> Imdbpy-help mailing list
> Imdbpy-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/imdbpy-help



-- 
Davide Alberani <davide.alber...@gmail.com>  [PGP KeyID: 0x3845A3D4AC9B61AD]
http://www.mimante.net/

