Hi Hady,

Parsing Turtle manually can be painful and error-prone ;)
Consider using a library for this, or convert the dumps to N-Triples (or an NT-style format) first.

These are the options I know of, though the list may not be complete:
any23, Jena, OpenRDF, rapper (CLI).
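
For example, a minimal sketch with rdflib (a Python library in the same
spirit, not on the list above), assuming the dump is in a file called
dump.ttl:

    from rdflib import Graph

    # rdflib's Turtle parser handles prefixes, abbreviated syntax, etc.
    g = Graph()
    g.parse("dump.ttl", format="turtle")
    print(len(g), "triples parsed")

Or convert once on the command line, e.g. with
rapper -i turtle -o ntriples dump.ttl > dump.nt, and keep any
line-based tooling working on the N-Triples output.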

Dimitris


On Fri, Aug 9, 2013 at 2:20 PM, Hady elsahar <[email protected]> wrote:

> Hello Dimitris,
>
> the wda code offers an option 'turtle-links' that extracts only the
> language links, in the form:
>
> <http://ilo.wikipedia.org/wiki/Republika%20iti%20Irl%C3%A1nda>
>         a       so:Article ;
>         so:about        w:Q27 ;
>         so:inLanguage   "ilo" .
> <http://gl.wikipedia.org/wiki/Irlanda>
>         a       so:Article ;
>         so:about        w:Q27 ;
>         so:inLanguage   "gl" .
>
> the other option 'turtle' extracts the whole dump.
> Can we use the file extracted with the first option? It would be a lot faster.
>
>
> the LLextraction code is written for the .nt format, so I'll adapt it for
> Turtle and then run it on the lgd server.
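>
> For the turtle-links format above, the link extraction could look
> roughly like this (a sketch with rdflib, assuming the so: prefix
> expands to http://schema.org/ and the filename is hypothetical):
>
>     from rdflib import Graph, Namespace, RDF
>
>     SO = Namespace("http://schema.org/")  # assumed expansion of so:
>
>     g = Graph()
>     g.parse("turtle-links.ttl", format="turtle")
>
>     # every language link in the snippet above is typed so:Article
>     for article in g.subjects(RDF.type, SO.Article):
>         item = g.value(article, SO.about)       # e.g. w:Q27
>         lang = g.value(article, SO.inLanguage)  # e.g. "ilo"
>         print(article, item, lang)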
>
> Thanks,
> Regards
>
> On Sat, Aug 3, 2013 at 6:25 PM, Dimitris Kontokostas <[email protected]> wrote:
>
>> Hi Hady,
>>
>> This might be what we were waiting for :)
>> If no one else objects, can you create a Turtle dump and re-test/adapt
>> your existing ILL code?
>> Afterwards we can start the mappings process.
>>
>> Best,
>> Dimitris
>>
>>
>> ---------- Forwarded message ----------
>> From: Markus Krötzsch <[email protected]>
>> Date: Sat, Aug 3, 2013 at 4:48 PM
>> Subject: [Wikidata-l] Wikidata RDF export available
>> To: "Discussion list for the Wikidata project." <
>> [email protected]>
>>
>>
>> Hi,
>>
>> I am happy to report that an initial, yet fully functional RDF export for
>> Wikidata is now available. The exports can be created using the
>> wda-export-data.py script of the wda toolkit [1]. This script downloads
>> recent Wikidata database dumps and processes them to create RDF/Turtle
>> files. Various options are available to customize the output (e.g., to
>> export statements but not references, or to export only texts in English
>> and Wolof). The file creation takes a few (about three) hours on my machine,
>> depending on what exactly is exported.
>>
>> For your convenience, I have created some example exports based on
>> yesterday's dumps. These can be found at [2]. There are three Turtle files:
>> site links only, labels/descriptions/aliases only, statements only. The
>> fourth file is a preliminary version of the Wikibase ontology that is used
>> in the exports.
>>
>> The export format is based on our earlier proposal [3], but it adds a lot
>> of details that had not been specified there yet (namespaces, references,
>> ID generation, compound datavalue encoding, etc.). Details might still
>> change, of course. We might provide regular dumps at another location once
>> the format is stable.
>>
>> As a side effect of these activities, the wda toolkit [1] is also getting
>> more convenient to use. Creating code for exporting the data into other
>> formats is quite easy.
>>
>> Features and known limitations of the wda RDF export:
>>
>> (1) All current Wikidata datatypes are supported. Commons-media data is
>> correctly exported as URLs (not as strings).
>>
>> (2) One-pass processing. Dumps are processed only once, even though this
>> means that we may not know the types of all properties when we first need
>> them: the script queries wikidata.org to find missing information. This
>> is only relevant when exporting statements.
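>>
>> For illustration, the lookup-on-miss idea might be sketched like this
>> (my reading of the description, not the actual wda code; it uses the
>> public wbgetentities API):
>>
>>     import json
>>     import urllib.request
>>
>>     _types = {}  # cache: property id -> datatype, filled on demand
>>
>>     def property_type(pid):
>>         """Return a property's datatype, e.g. 'time' for P569."""
>>         if pid not in _types:
>>             url = ("https://www.wikidata.org/w/api.php"
>>                    "?action=wbgetentities&format=json"
>>                    "&props=datatype&ids=" + pid)
>>             with urllib.request.urlopen(url) as resp:
>>                 data = json.load(resp)
>>             _types[pid] = data["entities"][pid]["datatype"]
>>         return _types[pid]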
>>
>> (3) Limited language support. The script uses Wikidata's internal
>> language codes for string literals in RDF. In some cases, this might not be
>> correct. It would be great if somebody could create a mapping from Wikidata
>> language codes to BCP47 language codes (let me know if you think you can do
>> this, and I'll tell you where to put it).
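>>
>> As a hypothetical starting point, such a mapping could be a small
>> table with a pass-through default (the entries below are illustrative
>> and would need checking by someone who knows the codes):
>>
>>     WIKIDATA_TO_BCP47 = {
>>         "als": "gsw",             # the 'als' wiki is Alemannic
>>         "no": "nb",               # 'no' content is Norwegian Bokmal
>>         "be-x-old": "be-tarask",
>>     }
>>
>>     def to_bcp47(code):
>>         # default: assume the Wikidata code is already valid BCP47
>>         return WIKIDATA_TO_BCP47.get(code, code)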
>>
>> (4) Limited site language support. To specify the language of linked wiki
>> sites, the script extracts a language code from the URL of the site. Again,
>> this might not be correct in all cases, and it would be great if somebody
>> had a proper mapping from Wikipedias/Wikivoyages to language codes.
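>>
>> The URL heuristic itself is simple; a rough sketch (again my
>> illustration, not the wda code), whose output a mapping like the one
>> above could then correct:
>>
>>     from urllib.parse import urlparse
>>
>>     def site_language(url):
>>         host = urlparse(url).hostname or ""
>>         return host.split(".")[0]  # "ilo.wikipedia.org" -> "ilo"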
>>
>> (5) Some data excluded. Data that cannot currently be edited is not
>> exported, even if it is found in the dumps. Examples include statement
>> ranks and timezones for time datavalues. I also currently exclude labels
>> and descriptions for simple English, formal German, and informal Dutch,
>> since these would pollute the label space for English, German, and Dutch
>> without adding much benefit (other than possibly for simple English
>> descriptions, I cannot see any case where these languages should ever have
>> different Wikidata texts at all).
>>
>> Feedback is welcome.
>>
>> Cheers,
>>
>> Markus
>>
>> [1] https://github.com/mkroetzsch/wda
>>     Run "python wda-export-data.py --help" for usage instructions
>> [2] http://semanticweb.org/RDF/Wikidata/
>> [3] http://meta.wikimedia.org/wiki/Wikidata/Development/RDF
>>
>> --
>> Markus Kroetzsch, Departmental Lecturer
>> Department of Computer Science, University of Oxford
>> Room 306, Parks Road, OX1 3QD Oxford, United Kingdom
>> +44 (0)1865 283529               http://korrekt.org/
>>
>>
>> --
>> Kontokostas Dimitris
>>
>
>
>
> --
> -------------------------------------------------
> Hady El-Sahar
> Research Assistant
> Center of Informatics Sciences | Nile University <http://nileuniversity.edu.eg/>
>
>
>


-- 
Kontokostas Dimitris
