Yes, shell scripting usually seems to be the best way.
Here are some scripts I wrote a while ago that fetch all the
data for one or several classes from DBpedia:
http://code.google.com/p/aksw-commons/source/browse/#hg%2Fscripts%2Fdbpediadomain
For example,
./download_one_domain.sh Actor http://downloads.dbpedia.org/3.6/en/short_abstracts_en.nt.bz2
gets the short abstracts for all instances of the class Actor.
It uses the php-cli clean.php script to do the finishing mentioned by Pablo.
It is not optimal, but it kind of works and you should be able to adjust
it to your needs.
All the best,
Sebastian
On 30.06.2011 10:46, Pablo Mendes wrote:
Yes.
Use "wget" to download the NT files, then use "grep" to get only the
Film instances, "cut" to extract the URIs and creativity to finish the
task. :)
Without testing, a 2 min starter-kit draft is here:
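wget http://downloads.dbpedia.org/3.6/en/instance_types_en.nt.bz2
bunzip2 instance_types_en.nt.bz2
# match the ontology class itself, not every URI that contains "Film"
grep "<http://dbpedia.org/ontology/Film>" instance_types_en.nt > film_instances_en.nt
# N-Triples fields are separated by spaces, not tabs
cut -d " " -f 1 film_instances_en.nt > film_uris.set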
Now use this file to pull other data from infobox properties and
categories for example.
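To make that last step a bit more concrete, here is a rough sketch
(again not tested against the full dumps) of how the URI list could be
used to filter another N-Triples file. The function name
filter_by_subject is made up for illustration, and it assumes one triple
per line with the subject as the first space-separated field, as in the
DBpedia .nt dumps.

```shell
#!/bin/sh
# filter_by_subject SET_FILE NT_FILE
# Hypothetical helper: prints the triples in NT_FILE whose subject URI
# appears in SET_FILE (one URI per line, e.g. film_uris.set).
# Pass "-" as NT_FILE to read the triples from standard input.
filter_by_subject() {
  awk '
    # First file: remember every subject URI we want to keep.
    FILENAME == ARGV[1] { keep[$1] = 1; next }
    # Remaining input: print lines whose first field is a kept URI.
    ($1 in keep)
  ' "$1" "$2"
}
```

Something like
bzcat infobox_properties_en.nt.bz2 | filter_by_subject film_uris.set - > film_infobox_properties.nt
should then give only the film triples without unpacking the whole dump.
The dump file name here is an assumption, so check the download page
for the exact names.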
Cheers,
Pablo
On Thu, Jun 30, 2011 at 8:22 AM, sareh aghaei <[email protected]> wrote:
Hi,
I need all the data about DBpedia films for my work. The time
complexity of my algorithm is high, and I cannot stay connected
to the DBpedia SPARQL endpoint while the program is running, so I decided
to fetch the film data with SPARQL queries in several rounds and save
it to an RDF file. However, I ran into errors because the
server was busy, among other things.
Are there any suggestions for fetching all the DBpedia data about films?
Thanks a lot,
------------------------------------------------------------------------------
All of the data generated in your IT infrastructure is seriously
valuable.
Why? It contains a definitive record of application performance,
security
threats, fraudulent activity, and more. Splunk takes this data and
makes
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org