Hi Andrea and Kavi,

First of all, thank you both for your feedback.

On 10/10/2012 01:25 PM, AboutThisDay wrote:
+1 on this request. The DBpedia live updates are a very powerful, unique and extremely useful feature, but I have also noticed some similar inconsistencies.

Cheers,

Kavi

On Wed, Oct 10, 2012 at 11:43 AM, Andrea Giacomini <[email protected]> wrote:

    Hi all,

    I'm working on my master's thesis, which aims to understand the
    synchronization process between Wikipedia and the DBpedia
    live updates (changesets). Below I describe some of the
    problems I came across, and I would appreciate an answer:

    First, I cannot identify a one-to-one correspondence between the
    changes made in Wikipedia and the ones reported in DBpedia. In
    other words, I found that there are many more added and removed
    triples for a resource in DBpedia than there are changes to the
    same resource in its Wikipedia history page. How come? I was
    expecting that a change in the Wikipedia infobox of an article
    would be mapped in DBpedia to an added/removed triple for the
    same article/resource.


You are right about that; we will fix the issue and get back to you.



    Second, based on the structure of the DBpedia live updates, there
    is an inconsistency between a folder and its corresponding
    compressed archive. For example, if we consider the archive
    2012-09-01.tar.gz and the folder 2012-09-01, we find triples that
    are present in the former and missing in the latter. Is this
    because the system is sometimes down? If so, which of the two
    should we take into consideration for our analysis?


Actually, you should use the folder; the archive just compresses the folder so that anyone can download all the updates at once. If you open "2012-09-01.tar.gz", you will find files called "2012-09-01-00.tar.gz", "2012-09-01-01.tar.gz", and so on; each of those files has the same contents as its corresponding folder. In any case, you don't have to worry about this, as our sync-tool, which is available at http://sourceforge.net/projects/dbpintegrator/files/, takes care of all these details.
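
For anyone who wants to check the nesting described above by hand, here is a minimal Python sketch using only the standard tarfile module. It assumes the day-level archive holds one nested .tar.gz per hour, as described; the function names are hypothetical, and this is not how the sync-tool itself is implemented.

```python
import io
import tarfile


def list_nested_archives(day_archive_path):
    """Return the names of the .tar.gz members inside a day-level archive."""
    with tarfile.open(day_archive_path, mode="r:gz") as day:
        return sorted(
            member.name
            for member in day.getmembers()
            if member.isfile() and member.name.endswith(".tar.gz")
        )


def list_inner_files(day_archive_path, nested_name):
    """Return the file names stored in one nested (hourly) archive."""
    with tarfile.open(day_archive_path, mode="r:gz") as day:
        # Read the nested archive into memory, then open it as its own tar.
        inner_bytes = day.extractfile(nested_name).read()
    with tarfile.open(fileobj=io.BytesIO(inner_bytes), mode="r:gz") as hour:
        return sorted(m.name for m in hour.getmembers() if m.isfile())
```

Calling list_nested_archives("2012-09-01.tar.gz") on a downloaded copy should list the hourly archives, and the output of list_inner_files can then be compared against a listing of the matching folder.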


    Last but not least, a question about the last-modified field
    associated with each added/removed file. Does the last-modified
    value correspond to the actual time of the change made to a
    DBpedia resource, or to the time the added/removed file was
    uploaded to the changeset?


The file called "lastPublishedFile.txt" is the one the sync-tool uses to check whether more files are available; if there are none, it keeps waiting until more are published.
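
The waiting behaviour described above can be sketched roughly as the polling loop below. This is only an illustration, not the sync-tool's actual code: it assumes the marker file holds a single identifier that changes whenever something new is published, and read_marker is a hypothetical callable you would supply to fetch the current contents of "lastPublishedFile.txt".

```python
import time


def wait_for_new_marker(read_marker, last_seen, poll_seconds=60.0,
                        max_polls=None, sleep=time.sleep):
    """Poll until the marker differs from last_seen and return the new value.

    Returns None if max_polls is reached without seeing a change.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        current = read_marker().strip()
        # A non-empty value different from the last one means new files exist.
        if current and current != last_seen:
            return current
        polls += 1
        sleep(poll_seconds)  # nothing new yet: wait before checking again
    return None
```

Passing sleep and max_polls as parameters keeps the loop easy to try out without actually waiting.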



    Best regards,
    Andrea Giacomini


    
------------------------------------------------------------------------------
    Don't let slow site performance ruin your business. Deploy New
    Relic APM
    Deploy New Relic app performance management and know exactly
    what is happening inside your Ruby, Python, PHP, Java, and .NET app
    Try New Relic at no cost today and get our sweet Data Nerd shirt too!
    http://p.sf.net/sfu/newrelic-dev2dev
    _______________________________________________
    Dbpedia-discussion mailing list
    [email protected]
    <mailto:[email protected]>
    https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion





--
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig
