Hi Andrea and Kavi,
First of all, thank you both for your feedback.
On 10/10/2012 01:25 PM, AboutThisDay wrote:
+1 on this request. The DBpedia live updates are a very powerful,
unique and extremely useful feature, but I have also noticed some
similar inconsistencies.
Cheers,
Kavi
On Wed, Oct 10, 2012 at 11:43 AM, Andrea Giacomini
<[email protected]> wrote:
Hi all,
I'm working on my master's thesis, and my work aims to understand
the synchronization process between Wikipedia and the DBpedia
live updates (changesets). Below I describe some of the
problems I came across, on which I would appreciate an answer:
First, comparing the changes made in Wikipedia with the ones
reported in DBpedia, I cannot identify a one-to-one
correspondence. In other words, I found many more added and
removed triples for a resource in DBpedia than changes to the
same resource shown in the Wikipedia history page. How come? I
was expecting that a change in the Wikipedia infobox of
an article would be mapped in DBpedia to an added/removed triple for the
same article/resource.
You are right about that; we will fix the issue and get back to you.
Second, in the structure of the DBpedia live updates there is
an inconsistency between a folder and its corresponding archive,
e.g., if we consider the archive 2012-09-01.tar.gz and the
folder 2012-09-01, we find that there are triples that are present
in the former and missing in the latter. Is this because
the system is sometimes down? If so,
which of the two should we take into consideration for our analysis?
Actually, you should consider the folder; the archive simply packages
the folder so that anyone can download all the updates at once.
So if you open "2012-09-01.tar.gz", you will find files called
"2012-09-01-00.tar.gz", "2012-09-01-01.tar.gz", and so on; those files
contain the same contents as their corresponding folders.
Anyway, you don't have to worry about that, as our sync-tool, which is
available at http://sourceforge.net/projects/dbpintegrator/files/, takes
care of all these details.
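If you still want to check a folder against its archive yourself, the comparison could be sketched roughly as below. This is only an illustration, not part of the sync-tool: the function names are made up for this example, and it assumes that files can be matched by base name and that the hourly archives are nested directly inside the daily one.

```python
import io
import tarfile
from pathlib import Path


def names_in_daily_archive(archive_path):
    """Collect base names of the files inside a daily archive.

    Hourly archives nested in the daily one (names ending in .tar.gz)
    are opened in memory and their members are listed instead.
    """
    names = set()
    with tarfile.open(str(archive_path), "r:gz") as daily:
        for member in daily.getmembers():
            if not member.isfile():
                continue
            if member.name.endswith(".tar.gz"):
                data = daily.extractfile(member).read()
                with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as hourly:
                    for inner in hourly.getmembers():
                        if inner.isfile():
                            names.add(Path(inner.name).name)
            else:
                names.add(Path(member.name).name)
    return names


def names_in_folder(folder_path):
    """Collect base names of all regular files below a folder."""
    return {p.name for p in Path(folder_path).rglob("*") if p.is_file()}


def compare(archive_path, folder_path):
    """Return (only_in_archive, only_in_folder) as sorted lists of names."""
    in_archive = names_in_daily_archive(archive_path)
    in_folder = names_in_folder(folder_path)
    return sorted(in_archive - in_folder), sorted(in_folder - in_archive)
```

Calling compare("2012-09-01.tar.gz", "2012-09-01") on local copies would then list the file names that appear on only one side.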
The last point, but no less relevant, concerns the last-modified
field associated with the added/removed files. I want to understand
whether the last-modified value corresponds to the actual
time of the change carried out on a DBpedia resource or to the
upload time of the added/removed file in the changeset.
The file called "lastPublishedFile.txt" is the one the sync-tool uses to
know whether more files are available; if not, it keeps waiting until
more files are available.
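The waiting behaviour can be pictured with a small polling loop like the one below. It is a rough sketch and not the sync-tool's actual code: wait_for_new_file and fetch_last_published are hypothetical names, and the sketch only assumes that the content of "lastPublishedFile.txt" changes when a new file is published, without relying on its exact format.

```python
import time


def wait_for_new_file(fetch_last_published, last_seen, poll_seconds=60,
                      max_polls=None, sleep=time.sleep):
    """Poll until the published marker differs from the last one seen.

    fetch_last_published: hypothetical callable returning the current text
        of lastPublishedFile.txt (for example, downloaded over HTTP).
    last_seen: marker of the last file already processed.
    Returns the new marker, or None if max_polls is reached first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        latest = fetch_last_published().strip()
        if latest and latest != last_seen:
            return latest  # something new was published
        polls += 1
        sleep(poll_seconds)  # nothing new yet, keep waiting
    return None
```

Passing the fetch function and the sleep function as parameters keeps the loop easy to try out without network access.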
Best regards,
Andrea Giacomini
------------------------------------------------------------------------------
Don't let slow site performance ruin your business. Deploy New
Relic APM
Deploy New Relic app performance management and know exactly
what is happening inside your Ruby, Python, PHP, Java, and .NET app
Try New Relic at no cost today and get our sweet Data Nerd shirt too!
http://p.sf.net/sfu/newrelic-dev2dev
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
--
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig