Hi Piero,

  This seems to be a very common problem for the Italian dataset(s). I've done
some random spot checks, and more than 50% of the abstracts are messed up (the
checks were informal, so the numbers are only indicative).

   Just for example:

    http://dbpedia.org/page/Vasco_Rossi
    http://dbpedia.org/page/Gina_Lollobrigida
    http://dbpedia.org/page/Tom_Cruise

   and many others. Another problem: 'Ornella Muti' is missing.

   As you can see, the problem affects only Italian; the abstracts in other languages are OK.

Please let me know. I can help to fix that.

Alessio
  
-------- Original Message --------
Subject: Re: [Dbpedia-discussion] Italian short / long abstract problem
From: Piero Molino <[email protected]>
Date: Thu, September 22, 2011 10:12 am
To: Sebastian Hellmann <[email protected]>
Cc: [email protected],
[email protected]

Hello Alessio,

The abstract probably starts from the second sentence because many first sentences are generated from templates in the Italian Wikipedia.
A clear example is the bio template for people.
So please take a look at the original Wikipedia page source and see if this is the problem. If not, please give us some examples of the messed-up abstracts.

Regards,
Piero Molino



On 22 Sep 2011, at 19:06, Sebastian Hellmann wrote:

Dear Alessio,
Sorry, but this is actually not at the top of our to-do list.
We could assist you a little in fixing the problem.
Would you be willing to try it?
Sebastian

On 09/22/2011 02:02 PM, [email protected] wrote:
Hello,

   In DBpedia 3.7, a large number of the long and short abstracts for the Italian language are messed up.
They start from the second sentence of the Wikipedia article, skipping the first one, so the
abstract as a whole is of little use, as the subject is often unclear.

   Is it possible to fix this problem?

Alessio
_______________________________________________ Dbpedia-discussion mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
