Re: [Imdbpy-help] imdbpy to mysql help
On Sun, Feb 24, 2013 at 12:32 AM, D L dlm...@hotmail.com wrote:

> Ok, well here's an update. I just let the foreign keys run for a little over a full day and it actually completed for mysql:
>
> # TIME FINAL : 1883min, 1sec (wall) 23min, 57sec (user) 0min, 5sec (system)

I see. I've just run it with a subset of the db (1% taken from each file) and my numbers are:

# TIME TOTAL TIME TO INSERT/WRITE DATA : 12min, 18sec (wall) 5min, 23sec (user) 0min, 43sec (system)
building database indexes (this may take a while)
# TIME createIndexes() : 1min, 25sec (wall) 0min, 0sec (user) 0min, 0sec (system)
adding foreign keys (this may take a while)
# TIME createForeignKeys() : 10min, 2sec (wall) 0min, 0sec (user) 0min, 0sec (system)
RESTORING imdbIDs values for movies... DONE! (restored 0 entries out of 0)
# TIME restore movies : 0min, 0sec (wall) 0min, 0sec (user) 0min, 0sec (system)
RESTORING imdbIDs values for people... DONE! (restored 0 entries out of 0)
# TIME restore people : 0min, 0sec (wall) 0min, 0sec (user) 0min, 0sec (system)
RESTORING imdbIDs values for characters... DONE! (restored 0 entries out of 0)
# TIME restore characters : 0min, 0sec (wall) 0min, 0sec (user) 0min, 0sec (system)
RESTORING imdbIDs values for companies... DONE! (restored 0 entries out of 0)
# TIME restore companies : 0min, 0sec (wall) 0min, 0sec (user) 0min, 0sec (system)
# TIME FINAL : 23min, 45sec (wall) 5min, 23sec (user) 0min, 43sec (system)

What kind of CPU/RAM/disk have you used?

> One of my main questions right now is the difference in results between the web search and the sql search. For example, if I ran a search on all the movies that Denzel Washington has acted in via the web search, it basically outputs all the main ones,

Yep, they are just grouped in a different way. It would not be easy for us (even if it's not impossible, I guess) to identify all the various categories used on the web and the rules used to categorize the movies, but...

For the moment, I think you could take the whole filmography and search for TV series and/or movies in which the actor is playing Himself (or anything that starts with Himself/Herself/Themselves).

> And I haven't tested it that much, but it appears that sqlite and mysql have roughly the same speeds in running these queries, but I'm not completely sure yet.

I expect them to be comparable in speed, but not to be slower than a web search. :-/

--
Davide Alberani davide.alber...@gmail.com [PGP KeyID: 0x465BFD47]
http://www.mimante.net/

___
Imdbpy-help mailing list
Imdbpy-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-help
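The suggestion above (take the whole filmography and single out roles starting with Himself/Herself/Themselves) could be sketched roughly as follows. This is only an illustration against a toy in-memory SQLite database: the table and column names are assumptions loosely modelled on the IMDbPY SQL import, and the sample rows and the function name `filmography_without_self_roles` are made up. Check imdb/parser/sql/dbschema.py for the real schema before adapting it.

```python
import sqlite3

# Toy schema: table and column names are ASSUMPTIONS loosely modelled on
# what the IMDbPY SQL import creates; see imdb/parser/sql/dbschema.py.
SCHEMA = """
CREATE TABLE name (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE char_name (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cast_info (
    id INTEGER PRIMARY KEY,
    person_id INTEGER,
    movie_id INTEGER,
    person_role_id INTEGER
);
"""

SELF_PREFIXES = ("Himself", "Herself", "Themselves")


def filmography_without_self_roles(conn, person_name):
    """Return the titles for person_name, skipping 'Himself ...'-style roles."""
    # One NOT LIKE clause per prefix; COALESCE keeps rows with no character name.
    not_self = " AND ".join(
        "COALESCE(c.name, '') NOT LIKE ?" for _ in SELF_PREFIXES
    )
    query = f"""
        SELECT t.title
        FROM cast_info ci
        JOIN name n ON n.id = ci.person_id
        JOIN title t ON t.id = ci.movie_id
        LEFT JOIN char_name c ON c.id = ci.person_role_id
        WHERE n.name = ? AND {not_self}
        ORDER BY t.title
    """
    params = [person_name] + [prefix + "%" for prefix in SELF_PREFIXES]
    return [row[0] for row in conn.execute(query, params)]


def demo():
    """Build a tiny database with made-up rows and run the filter on it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO name VALUES (1, 'Doe, John')")
    conn.executemany(
        "INSERT INTO title VALUES (?, ?)",
        [(1, "Example Movie"), (2, "Example Awards Show"), (3, "Uncredited Role Movie")],
    )
    conn.executemany(
        "INSERT INTO char_name VALUES (?, ?)",
        [(1, "Detective Smith"), (2, "Himself - Presenter")],
    )
    conn.executemany(
        "INSERT INTO cast_info (person_id, movie_id, person_role_id) VALUES (?, ?, ?)",
        [(1, 1, 1), (1, 2, 2), (1, 3, None)],
    )
    return filmography_without_self_roles(conn, "Doe, John")
```

In this toy data the awards show entry is dropped because its role starts with "Himself", while the entry with no character name is kept. The same filter could also be applied in plain Python to the role names of a filmography that has already been fetched.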
Re: [Imdbpy-help] imdbpy to mysql help
Ok, well here's an update. I just let the foreign keys run for a little over a full day and it actually completed for mysql:

# TIME FINAL : 1883min, 1sec (wall) 23min, 57sec (user) 0min, 5sec (system)

One of my main questions right now is the difference in results between the web search and the sql search. For example, if I run a search on all the movies that Denzel Washington has acted in via the web search, it basically outputs all the main ones, whereas if I do it via the sql search it will include a lot of random stuff like award ceremonies and random tv shows that he may have had a cameo on. How would I make the sql search more like the web search so that it excludes stuff like award ceremonies and only outputs the main movies?

And I haven't tested it that much, but it appears that sqlite and mysql have roughly the same speeds in running these queries, but I'm not completely sure yet.

From: dlm...@hotmail.com
To: davide.alber...@gmail.com; imdbpy-help@lists.sourceforge.net
Subject: RE: [Imdbpy-help] imdbpy to mysql help
Date: Fri, 22 Feb 2013 00:10:28 -0800

So after updating those dependencies, MySQL still gets stuck on the foreign keys section; however, sqlite actually manages to finish. But one of my concerns is that even the requests with sqlite can be slow the first time, and on occasion the web access was a lot faster than using sqlite. For example, the search_person script is faster via the web, but if I run it twice (searching the same person) using the sql database, the 2nd time is noticeably faster, most likely due to the data already being cached.

My question is how long something like search_person takes on MySQL (if I can eventually get it to work), since using sqlite seems like it's slower than just going the web route so far.
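To put numbers on the "slow the first time, faster the second time" observation above, a small timing helper along these lines may be enough. It is only a sketch: `time_calls` is a made-up name, and it measures wall-clock time for whatever callable it is given.

```python
import time


def time_calls(func, *args, repeats=2):
    """Call func(*args) `repeats` times; return each call's wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        durations.append(time.perf_counter() - start)
    return durations
```

With IMDbPY this could be used as `time_calls(ia.search_person, 'Denzel Washington')`, where `ia` is an `imdb.IMDb(...)` instance created for the web or the sql access system. The first value would then be the cold run and the second the (probably cached) repeat.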
From: dlm...@hotmail.com
To: davide.alber...@gmail.com
Subject: RE: [Imdbpy-help] imdbpy to mysql help
Date: Tue, 19 Feb 2013 17:54:16 -0800

> Date: Tue, 19 Feb 2013 21:28:18 +0100
> Subject: Re: [Imdbpy-help] imdbpy to mysql help
> From: davide.alber...@gmail.com
> To: dlm...@hotmail.com
> CC: imdbpy-help@lists.sourceforge.net
>
> On Sun, Feb 17, 2013 at 11:45 PM, D L dlm...@hotmail.com wrote:
>
> > Yeah, tried that and ran it overnight, still no luck: it gets stuck on the foreign keys part. I'm just trying this on my laptop, so I may just proceed with using the web access for the data. Once I get everything set up for web hosting, I may try other databases such as sqlite to see if that works.
>
> D'oh! :(
> Versions of:
> - IMDbPY
> - SQLAlchemy
> - SQLObject
> - MySQL
> - python-mysqldb
> - python-migrate
> ?

IMDbPY - 5.0dev20130210
SQLAlchemy - 0.8.0b2
SQLObject - 1.3.2
MySQL - Server version: 5.5.29-0ubuntu0.12.04.1 (Ubuntu)
python-mysqldb - 1.2.3
python-migrate - 0.7.2

Both my python-mysqldb and python-migrate were older versions, which I just updated as I typed this. I tried the process with sqlite a night ago and it was stuck on the foreign keys section as well; I will try it again now that mysqldb and migrate have been updated, and hopefully it will work. I also wrote a rough script for the data retrieval using the web access method, and you're right, it does take a while.

> Anyway, if you interrupt it while it's creating the foreign keys, maybe you can try to see which were already created, and add the missing ones following the scheme you can find in imdb/parser/sql/dbschema.py
>
> Anyway, obviously I'll try to reproduce the problem, since it's not nice at all. :-/

Hopefully the updated mysqldb and migrate will fix it, but we'll see.

> --
> Davide Alberani davide.alber...@gmail.com [PGP KeyID: 0x465BFD47]
> http://www.mimante.net/
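The idea from the quoted message (see which foreign keys were already created, then add the missing ones) might look roughly like this for MySQL. It is a sketch only: the helper names are hypothetical, the tuples are placeholders, and the real list of expected keys has to be read from imdb/parser/sql/dbschema.py. The lookup relies on the standard MySQL `information_schema.KEY_COLUMN_USAGE` view.

```python
def fk_lookup_query():
    """SQL listing the foreign keys that already exist in one MySQL schema.

    Meant to be run with the schema name as the single parameter, e.g.
    cursor.execute(fk_lookup_query(), ("imdb",)) with a MySQLdb cursor.
    """
    return (
        "SELECT TABLE_NAME, COLUMN_NAME, "
        "REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME "
        "FROM information_schema.KEY_COLUMN_USAGE "
        "WHERE TABLE_SCHEMA = %s AND REFERENCED_TABLE_NAME IS NOT NULL"
    )


def missing_foreign_keys(expected, existing):
    """Return the expected (table, column, ref_table, ref_column) tuples
    that are not present in the rows fetched from the database."""
    existing_set = {tuple(row) for row in existing}
    return [fk for fk in expected if fk not in existing_set]


def add_fk_statement(table, column, ref_table, ref_column):
    """Build an ALTER TABLE statement that adds one foreign key."""
    return (
        f"ALTER TABLE `{table}` ADD FOREIGN KEY (`{column}`) "
        f"REFERENCES `{ref_table}` (`{ref_column}`)"
    )
```

The statements built by `add_fk_statement` would then be executed one at a time, which also makes it easier to see which particular key is the slow one.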