Hello,

Piero Molino wrote:
> Ok, no one has a solution for my problem ^_^ I hope that Jens will
> answer this time because he is one of the authors of the article I
> cited in the previous message, so he knows best.
>
> I managed to find the functions I need in the relfinder source code
> (they were in index.php) and I realized how the database queries are
> done, and as I thought they were practically a series of joins. By the
> way, I can't get things working because of the statements table: can
> someone who tried this tell me how to construct it? May I use the
> DBpedia csv dumps and import them into a MySQL table like this:
>
> (
> `subject` varchar(255) collate latin1_general_ci NOT NULL,
> `predicate` varchar(255) collate latin1_general_ci NOT NULL,
> `object` varchar(255) collate latin1_general_ci NOT NULL,
> `id` int(10) unsigned NOT NULL,
> PRIMARY KEY (`id`)
> )
>
> ? (this is the code of the CopyTable from the relfinder source code)

At the time we wrote the Relationship Finder, the statements table was easy to create: you just had to download the csv file of the DBpedia release and load it into your database. Things have changed a bit since then, and you now have to make slight modifications to the extraction code. I prepared a csv file for you here:

http://downloads.dbpedia.org/tmp/infobox.csv.bz2

In the next step, you have to load the (decompressed) data into your DB, which can be done using e.g. this PHP script on the command line:

    <?php
    // $user and $pass must be set to your MySQL credentials before running.
    $connection = mysql_connect('localhost', $user, $pass, true);
    mysql_select_db('dbpedia_relfinder', $connection) or die(mysql_error());

    // Start from a clean statements table.
    mysql_query("DROP TABLE IF EXISTS statements") or die(mysql_error());
    mysql_query("CREATE TABLE `statements` (
        `id` int(10) unsigned NOT NULL auto_increment,
        `subject` varchar(255) collate latin1_general_ci NOT NULL default '',
        `predicate` varchar(255) collate latin1_general_ci NOT NULL default '',
        `object` text collate latin1_general_ci,
        `object_is` char(1) collate latin1_general_ci NOT NULL default '',
        PRIMARY KEY (`id`),
        KEY `s_sub_pred_idx` (`subject`(200)),
        KEY `s_pred_idx` (`predicate`(200)),
        KEY `s_obj_idx` (`object`(250))
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;"
    ) or die(mysql_error());

    // Bulk-load the csv file; IGNORE skips rows that would duplicate a key.
    mysql_query("LOAD DATA LOCAL INFILE 'infobox.csv' IGNORE INTO TABLE statements")
        or die(mysql_error());
    ?>

The second step is computing the components of the RDF graph. To do this, you have to execute cluster_main.php on the command line. This can take hours or even days depending on your machine.

Regarding the queries, you are right that they are basically joins. We can very easily detect whether two resources are in the same component of the graph and, as you read in the paper, we can also efficiently give a minimum and maximum value for the distance between two resources. The hard part is to detect the exact distance. Using MySQL, we found that joins perform quite reasonably if the distance is below 8 (if I remember correctly).
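
To make the join idea concrete, here is a minimal sketch, not the actual Relationship Finder code. It uses Python with an in-memory SQLite table instead of PHP and MySQL so that it runs on its own. The sample rows, the function names (build_demo_db, chain_query, find_chains, shortest_distance) and the restriction to directed subject-to-object chains are all assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with the core columns of `statements`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statements ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " subject TEXT NOT NULL,"
        " predicate TEXT NOT NULL,"
        " object TEXT)"
    )
    # Hypothetical sample triples: A -> B -> C -> D, plus one unrelated edge.
    rows = [
        ("A", "knows", "B"),
        ("B", "worksFor", "C"),
        ("C", "locatedIn", "D"),
        ("A", "bornIn", "D2"),
    ]
    conn.executemany(
        "INSERT INTO statements (subject, predicate, object) VALUES (?, ?, ?)",
        rows,
    )
    return conn


def chain_query(distance):
    """Build SQL that joins `distance` copies of statements end to end."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    aliases = [f"s{i}" for i in range(1, distance + 1)]
    cols = ", ".join(f"{a}.subject, {a}.predicate, {a}.object" for a in aliases)
    sql = f"SELECT {cols} FROM statements AS {aliases[0]}"
    # Each additional hop is one more self-join: object of the previous
    # statement must equal the subject of the next one.
    for prev, cur in zip(aliases, aliases[1:]):
        sql += f" JOIN statements AS {cur} ON {cur}.subject = {prev}.object"
    sql += f" WHERE {aliases[0]}.subject = ? AND {aliases[-1]}.object = ?"
    return sql


def find_chains(conn, start, end, distance):
    """Return all chains of exactly `distance` statements from start to end."""
    return conn.execute(chain_query(distance), (start, end)).fetchall()


def shortest_distance(conn, start, end, max_distance=4):
    """Try join depths 1..max_distance and return the first that matches."""
    for distance in range(1, max_distance + 1):
        if find_chains(conn, start, end, distance):
            return distance
    return None
```

Calling shortest_distance(build_demo_db(), "A", "D") tries join depths 1, 2 and 3 in turn and stops at the first depth that yields a chain. Every extra step adds another self-join on the statements table, which is why the exact distance gets expensive as it grows.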

In the meantime I have also seen other (relatively recent) approaches to computing the distance between resources, which are particularly targeted at large graphs. However, I do not have a literature reference at hand.

Currently, we are thinking about reviving the DBpedia Relationship Finder and are looking at ways to provide this tool without the maintenance overhead of keeping it up to date. This means that we will probably use SPARQL queries against Virtuoso. This approach works well for distances up to 3 or 4. (You can try SPARQL queries against the official DBpedia endpoint to test this.)

Kind regards,

Jens

-- 
Dipl. Inf. Jens Lehmann
Department of Computer Science, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc

_______________________________________________
Dbpedia-discussion mailing list
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
