Do you have appropiate indexes for it? Antony T Curtis On Feb 26, 2013 6:00 PM, "Arjen Lentz" <ar...@openquery.com> wrote:
> Hmm somehow a typo (or old syntax) was still present in the SQL file. > > $ zcat actors.list.gz | pv | ./imdb2sql | mysql -u root > --socket=/tmp/mysql.sock test > 725MB 0:47:17 > loading the actors requires 260263 pages (or K) in the keybuffer. > > $ zcat actresses.list.gz | pv | ./imdb2sql | mysql -u root > --socket=/tmp/mysql.sock test > 427MB 0:24:42 > After this, 380564 MyISAM index pages in the key buffer. > > So key_buffer_size=400M for the above two operations is sufficient. > > mysql> insert into actors_movies_actors select > am1.actor_movie_id,am2.actor_id from actors_movies am1 join actors_movies > am2 on (am1.movie_id=am2.movie_id AND am1.actor_id != am2.actor_id); > Query OK, 169301244 rows affected (1 hour 39 min 7.58 sec) > Records: 169301244 Duplicates: 0 Warnings: 0 > > MariaDB [test]> insert ignore into actors_graph_base (origid,destid) > select am.actor_id,ama.actor_id from actors_movies_actors ama join > actors_movies am using (actor_movie_id); > Query OK, 149853110 rows affected, 65535 warnings (1 hour 51 min 19.86 sec) > Records: 169301244 Duplicates: 19448134 Warnings: 19448134 > > Ok so a simple actor -> actor graph is 149 million vertices. > > Somehow the system doesn't like the actors_graph table, getting errors > (not found, error 2) no matter what I do. When I name it ag it's fine, with > the basetable named agb. > Single underscores have been used in other tests, so I dunno what's going > on there. > > > ----- Original Message ----- > > From: "Arjen Lentz" <ar...@openquery.com> > > To: oqgraph-dev@lists.launchpad.net > > Sent: Tuesday, 26 February, 2013 10:06:51 AM > > Subject: [Oqgraph-dev] IMDB tools > > Hi all > > > > Here is my old IMDB conversion tool (C code), and the schema I used. > > It uses only Actors and Movies, filtering out TV series and DVDs. > > > > 1484084 actors \_ actors > > 866799 actresses / > > 574006 movies > > 5900319 actors_movies > > > > actors.list.gz and actresses.list.gz files from > > ftp://ftp.fu-berlin.de/pub/misc/movies/database/ > > (via http://www.imdb.com/interfaces) > > > > My laptop is still building the actors_movies_actors table so I don't > > yet have a total for that, but from memory I think it was around 45 > > million edges. > > > > Currently I load using MyISAM with delay key writes and large key > > buffer size. > > We can possibly optimise the load methodology a bit further. > > > > > > Cheers, > > Arjen. > > -- > > Arjen Lentz, Exec.Director @ Open Query (http://openquery.com) > > Australian peace of mind for your MySQL/MariaDB infrastructure. > > > > Follow us at http://openquery.com/blog/ & http://twitter.com/openquery > > > > > > -- > > Mailing list: https://launchpad.net/~oqgraph-dev > > Post to : oqgraph-dev@lists.launchpad.net > > Unsubscribe : https://launchpad.net/~oqgraph-dev > > More help : https://help.launchpad.net/ListHelp > > -- > Arjen Lentz, Exec.Director @ Open Query (http://openquery.com) > Australian peace of mind for your MySQL/MariaDB infrastructure. > > Follow us at http://openquery.com/blog/ & http://twitter.com/openquery > > > -- > Mailing list: https://launchpad.net/~oqgraph-dev > Post to : oqgraph-dev@lists.launchpad.net > Unsubscribe : https://launchpad.net/~oqgraph-dev > More help : https://help.launchpad.net/ListHelp > >
-- Mailing list: https://launchpad.net/~oqgraph-dev Post to : oqgraph-dev@lists.launchpad.net Unsubscribe : https://launchpad.net/~oqgraph-dev More help : https://help.launchpad.net/ListHelp