Sorry for the very late answer on this. I'll start with what I can
answer off-hand, but please keep asking if things remain unclear!

On 02/18/2011 10:29 PM, Benedikt Kaempgen wrote:
> Thanks for your answers.
>
> I like RDFIO's behavior of just importing the RDF, without adding SMW
> specific elements (e.g. categories, redirects etc.). My goal is it to import
> any RDF (RDF(S), OWL) and then run through it with a bot to set up the look
> and feel.
>
> However, I see some problems with RDFIO:
>
> - How are the property types of data type properties set? Should this not be
> done generic, e.g. as specific as possible, as general as needed?

Could you please give an example of what you mean? (Things are fading
from memory, so I'm not sure I follow completely ...)

> - As I see it, SMW RDF export and RDFIO SPARQL endpoint give different
> outputs. Should they not provide the same information?

They should, *unless* you use one of the "converter" options available
in the SPARQL form, which convert URIs to their "Original URIs" based
on the "Original URI" facts in the articles.

This might not be the best solution conceivable, but it seemed like one
that could easily be implemented to meet the need of providing (on
request) a consistent vocabulary on import and export (i.e., using the
same URI on export that was used to create an article on import), while
not changing the way SMW names its internal URIs (the "URIResolver"
stuff), a change which seemed to go beyond the GSoC project.

Did this explain the differences you observed?
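
To make the "converter" idea a bit more concrete, here is a minimal
Python sketch (not RDFIO's actual PHP code). The URIs, the lookup table
and the function name are all made up for illustration; the only thing
taken from RDFIO is the idea of swapping an internal URI for its
"Original URI" when such a fact exists.

```python
# Hypothetical lookup: internal wiki URI -> "Original URI" fact on that page.
ORIGINAL_URIS = {
    "http://wiki.example.org/resolve/Page_A": "http://data.example.org/resource/A",
    "http://wiki.example.org/resolve/Property_p": "http://data.example.org/vocab#p",
}


def to_original_uris(triples, original_uris):
    """Swap every term that has an "Original URI" fact; keep all other terms as-is."""
    return [
        tuple(original_uris.get(term, term) for term in triple)
        for triple in triples
    ]


internal = [
    (
        "http://wiki.example.org/resolve/Page_A",
        "http://wiki.example.org/resolve/Property_p",
        "http://wiki.example.org/resolve/Page_B",  # no "Original URI" fact
    )
]
converted = to_original_uris(internal, ORIGINAL_URIS)
```

With the converter switched off (an empty lookup table here), the
triples come out unchanged, which is the case where the two outputs
should agree.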

> - What happens if imported information already is available in the wiki? Do
> you really think it should be duplicated? Also, I am not quite sure, whether
> RDFIO properly checks if pages already represent URIs and can be just
> updated by the imported RDF.

If the information is already available, I think SMWWriter should take
care of not duplicating any facts. (I'll have to dig into the code again
soon to be able to say for sure which measures RDFIO does or does not
take to avoid unnecessary updates.)

> - Apparently, it is not that robust: When I run RDFIO it sometimes gave me
> an error because of inappropriate wiki titles. Also for bigger RDF it
> reached max execution time and quit (although I set a higher one in
> php.ini).

This is indeed true. I have run into problems with certain RDF
constructs, where it seems that the ARC library could not properly work
out the triple serialization ... (I really haven't nailed down the cause
of the problem, so I can't blame anyone, but it turns out that RDFIO
does not get proper triples from ARC for certain constructs ...).

Also, the timeouts are a general problem, since there is really heavy
stuff going on (at least 3 or so complete page edits per triple, due to
the addition of "Original URI" and "Equivalent URI" facts etc.). Room
for improvement, I suppose.
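
One possible direction (an untested idea, not something RDFIO does
today) would be to collect all triples for a page first and write each
page only once. A minimal Python sketch of the grouping step, with a
hypothetical function name and sample data:

```python
from collections import defaultdict


def group_by_subject(triples):
    """Collect the (property, value) pairs of each subject, so that every
    page could be written in one edit instead of several edits per triple."""
    pages = defaultdict(list)
    for subject, prop, value in triples:
        pages[subject].append((prop, value))
    return dict(pages)


triples = [
    ("Page A", "p", "1"),
    ("Page B", "p", "2"),
    ("Page A", "q", "3"),
]
pages = group_by_subject(triples)
edits_needed = len(pages)  # one edit per page rather than per triple
```

The "Original URI" and "Equivalent URI" facts could be appended to the
same per-page list before writing, so they would not cost extra edits.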

Generally, I feel that RDFIO would really benefit from some thorough
refactoring. If I get the time and strength one of these days, I might
give this a go (I'll have to do something before or during the summer
at the latest, since other projects I'm involved in need RDFIO), but
since this is a bit into the future, feel free to go ahead and make
patches or commits (and feel free to ping me by email to keep me
updated on the thinking behind them :))!

Hope this info helps, and I hope to pay a bit more attention to RDFIO soon!

Best
// Samuel


> Best,
>
> Benedikt
>
> --
> Karlsruhe Institute of Technology (KIT)
> Institute of Applied Informatics and Formal Description Methods (AIFB)
>
> Benedikt Kämpgen
> Research Associate
>
> Kaiserstraße 12
> Building 11.40
> 76131 Karlsruhe, Germany
>
> Phone: +49 721 608-47946 (!new since 1 January 2011!)
> Fax: +49 721 608-46580 (!new since 1 January 2011!)
> Email: benedikt.kaemp...@kit.edu
> Web: http://www.kit.edu/
>
> KIT - University of the State of Baden-Wuerttemberg and
> National Research Center of the Helmholtz Association
>
>
> -----Original Message-----
> From: Markus Krötzsch [mailto:mar...@semantic-mediawiki.org]
> Sent: Wednesday, February 09, 2011 1:11 PM
> To: Samuel Lampa
> Cc: semediawiki-devel@lists.sourceforge.net>>  SMW developer list
> Subject: Re: [SMW-devel] RDF Import
>
> On 07/02/2011 21:33, Samuel Lampa wrote:
>> On 02/07/2011 10:20 PM, Samuel Lampa wrote:
>>> I'm sure far from everything is perfectly thought out in RDFIO, but
>>> maybe it can be a useful starting point work for similar things like
>>> this.
>>
>> Thinking aloud now: .... Maybe, as much as there has recently been
>> discussions about merging efforts to provide a more generalized/
>> integrated SPARQL/RDF store functionality [1], it might be a good time
>> to think also about a more generalized/integrated RDF/OWL import
>> functionality too?
>
> I agree, and the former is really a prerequisite of the latter. Ideally,
> SMW could thus obtain access to external Linked Data (cached locally for
> query answering). Ways of dynamically locating and pulling such data
> have already been explored in the SMW-based ShortiPedia prototype
> (shortipedia.org).
>
> In any case (to answer Benedikt's initial email), the old vocabulary
> import functionality and the equivalent URI method both affect the RDF
> export only, and do not contribute to the data as viewed in SMW (e.g.
> via queries). This whole aspect of SMW needs to be rethought.
>
> Markus
>
>
> _______________________________________________
> Semediawiki-devel mailing list
> Semediawiki-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
>


-- 
Samuel Lampa
---------------------------------------
  Bioinformatician @ Uppsala University
    Blog: http://saml.rilspace.org
---------------------------------------
