Yes, shell scripting usually seems to be the best way.
Here are some scripts I wrote a while ago that fetch all the data for one or several classes from DBpedia:
http://code.google.com/p/aksw-commons/source/browse/#hg%2Fscripts%2Fdbpediadomain
e.g.
./download_one_domain.sh Actor http://downloads.dbpedia.org/3.6/en/short_abstracts_en.nt.bz2
gets all short abstracts for all instances of the class Actor.

It uses the clean.php script (run with php-cli) to do the finishing step Pablo mentions.

It is not optimal, but it kind of works and you should be able to adjust it to your needs.
All the best,
Sebastian


On 30.06.2011 10:46, Pablo Mendes wrote:
Yes.

Use "wget" to download the NT files, then use "grep" to get only the Film instances, "cut" to extract the URIs and creativity to finish the task. :)

Here is an untested, two-minute starter-kit draft:

wget http://downloads.dbpedia.org/3.6/en/instance_types_en.nt.bz2
bunzip2 instance_types_en.nt.bz2

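# Match the Film class URI itself, so that other URIs containing "Film" are skipped
grep '<http://dbpedia.org/ontology/Film>' instance_types_en.nt > film_instances_en.nt
# N-Triples fields are separated by spaces, not tabs
cut -d ' ' -f 1 film_instances_en.nt > film_uris.set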

Now use this file to pull other data, for example from the infobox properties and categories dumps.
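
A rough sketch of that step, under a few assumptions: the other dump has already been downloaded and unpacked, it is plain N-Triples with the subject as the first space-separated field, and film_uris.set holds one subject URI per line in the same <...> form. The function name filter_by_subject and the file names in the example call are hypothetical, not something from the scripts mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print only the triples of an N-Triples file whose
# subject is listed in a URI set file.
#   $1 = URI set, one <...> subject per line (e.g. film_uris.set)
#   $2 = uncompressed N-Triples dump to filter
filter_by_subject() {
    awk '
        # First file: remember every subject URI.
        FILENAME == ARGV[1] { keep[$1] = 1; next }
        # Second file: print lines whose first field is a remembered URI.
        ($1 in keep)
    ' "$1" "$2"
}

# Example call; the dump name is an assumption, adjust it to the file you downloaded:
# filter_by_subject film_uris.set infobox_properties_en.nt > film_infobox_en.nt
```

Matching on the first field only keeps triples where a film is the subject; triples that merely mention a film as object are dropped, which may or may not be what you need.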

Cheers,
Pablo

On Thu, Jun 30, 2011 at 8:22 AM, sareh aghaei <[email protected]> wrote:


    Hi,

    I need all the data about DBpedia films for my work. The time
    complexity of my algorithm is high, and I cannot keep querying
    the DBpedia SPARQL endpoint while the program runs, so I decided
    to fetch the DBpedia film data with SPARQL queries and save it to
    an RDF file in several passes. However, I ran into errors because
    the server was busy, among other things.
    Is there any suggestion for fetching all the data about DBpedia films?

    Thanks a lot,


    
    _______________________________________________
    Dbpedia-discussion mailing list
    [email protected]
    https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion





--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
