Hey all,

thanks for sharing the paper; this is an interesting topic. I just wanted to point to some of my own prior work on entity summarization, which is related to what you have done:
https://link.springer.com/chapter/10.1007/978-3-642-35173-0_24
All the best
Magnus

> On 08.03.2018 at 19:00, Aidan Hogan <aid...@gmail.com> wrote:
>
> Hey Raphaël,
>
> Thanks for the comments and the reference! And sorry we missed discussion of
> your paper (which indeed looks at largely the same problem in a slightly
> different context). If there's a next time, we will be sure to include it in
> the related work.
>
> I am impressed btw to see a third-party evaluation of a Google tool. Also it
> seems Google has room for improvement. :)
>
> Cheers,
> Aidan
>
> On 07-03-2018 13:43, Raphaël Troncy wrote:
>> Hey Aidan,
>> Great work, I loved it! You may want to (cite and) look at what we did 4
>> years ago where we tried to reverse engineer a bit what Google is doing when
>> choosing properties (and values) to show in its rich panels alongside
>> popular entities.
>> The paper is entitled "What Are the Important Properties
>> of an Entity? Comparing Users and Knowledge Graph Point of View",
>> https://www.eurecom.fr/~troncy/Publications/Assaf_Troncy-eswc14.pdf
>> ... and the code is on github to replicate: https://github.com/ahmadassaf/KBE
>> Raphaël
>> On 07/03/2018 at 05:53, Aidan Hogan wrote:
>>> Hi all,
>>>
>>> Tomás and I would like to share a paper that might be of interest to the
>>> community. It presents some preliminary results of a work looking at fully
>>> automated methods to generate Wikipedia info-boxes from Wikidata. The main
>>> focus is on deciding what information from Wikidata to include, and in what
>>> order. The results are based on asking users (students) to rate some
>>> prototypes of generated info-boxes.
>>>
>>> Tomás Sáez, Aidan Hogan "Automatically Generating Wikipedia Infoboxes from
>>> Wikidata". In the Proceedings of the Wiki Workshop at WWW 2018, Lyon,
>>> France, April 24, 2018.
>>>
>>> - Link: http://aidanhogan.com/docs/infobox-wikidata.pdf
>>>
>>> We understand that populating info-boxes is an important goal of Wikidata
>>> and hence we thought we'd share some lessons learned.
>>>
>>> Obviously a lot of work is being put into populating info-boxes from
>>> Wikidata, but the main methods at the moment seem to be template-based and
>>> require a lot of manual labour; plus the definition of these templates
>>> seems to be a difficult problem for classes such as person (where different
>>> information will have different priorities for people of different
>>> professions, notoriety, etc.).
>>>
>>> We were just interested to see how far we could get with a fully automated
>>> approach using some generic ranking methods. Also we thought that something
>>> like this could perhaps be used to generate a "default" info-box for
>>> articles with no info-box and no associated template mapping. The paper
>>> presents preliminary results along those lines.
>>>
>>> One interesting result is that a major factor in the evaluation of the
>>> generated info-boxes was the importance of the value. For example, Barack
>>> Obama has lots of awards, but perhaps only something like the Nobel Peace
>>> Prize might be of relevance to show in the info-box (<- being intended as
>>> an illustrative example rather than a concrete assertion of course!).
>>> Another example is that sibling might not be an important attribute in a
>>> lot of cases, but when that sibling is Barack Obama, then that deserves to
>>> be in the info-box (<- how such cases could be expressed in a purely
>>> template-based approach, we are not sure, but it would seem difficult).
>>>
>>> We assess the importance of values with PageRank. Assessing the importance
>>> not only of attributes, but of values, turned out to be a major influence
>>> on how highly our evaluators assessed the quality of the generated
>>> info-boxes.
>>>
>>> This initial/isolated observation might be interesting since, to the best
>>> of our understanding, the current wisdom on populating info-boxes from
>>> Wikidata focuses on what attributes to present and in which order, but does
>>> not consider the importance of values (aside from the Wikidata rank
>>> feature, which we believe is more intended to assess relevance/timeliness,
>>> than importance).
>>>
>>> Hence one of the most interesting (and surprising, for us at least) results
>>> of the work is to suggest that it appears to be important to rank *values*
>>> by importance (not just attributes) when considering what information the
>>> user might be interested in.
>>>
>>> (There are limitations to PageRank measures, however, in that they cannot
>>> assess, for example, the importance of a particular date, or, more
>>> generally, datatype values.)
>>>
>>> In any case, we are looking forward to presenting these results at the Wiki
>>> Workshop at WWW 2018, and any feedback or thoughts are welcome!
>>>
>>> Cheers,
>>> Aidan
>>>
>>> _______________________________________________
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata

-- 
Magnus Knuth

Hasso-Plattner-Institut für Digital Engineering gGmbH
Prof.-Dr.-Helmert-Str. 2-3
14482 Potsdam

Amtsgericht Potsdam, HRB 12184
Managing Director: Prof. Dr. Christoph Meinel

tel: +49 331 5509 547
email: magnus.kn...@hpi.de
web: http://www.hpi.de/
webID: http://magnus.13mm.de/
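
P.S. For readers who want to see the value-ranking idea from Aidan's message in concrete terms, here is a rough sketch of how PageRank scores could be used to pick which values of an attribute to show. It is only an illustration under stated assumptions, not the implementation from the paper: the toy link graph, the node names (person_a, prize_major, ...), the damping factor of 0.85, the iteration count and the helper rank_values are all hypothetical.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Plain power-iteration PageRank over a directed link graph.

    `links` maps each node to the nodes it points to. The damping factor
    and iteration count are assumed defaults, not values from the paper.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes without out-links is spread over all nodes.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def rank_values(values: List[str],
                scores: Dict[str, float],
                top_k: int) -> List[str]:
    """Keep the top_k values of one attribute, highest score first."""
    ordered = sorted(values, key=lambda v: scores.get(v, 0.0), reverse=True)
    return ordered[:top_k]


# Hypothetical toy graph: three people link to one award, one to another.
links = {
    "person_a": ["prize_major", "prize_minor"],
    "person_b": ["prize_major"],
    "person_c": ["prize_major"],
    "prize_major": [],
    "prize_minor": [],
}
scores = pagerank(links)

# person_a's "award" attribute has two values; keep only the top one.
shown = rank_values(["prize_minor", "prize_major"], scores, top_k=1)
print(shown)
```

As noted in the quoted message, a link-based score like this only covers values that are themselves entities; dates and other datatype values get no score here and would need a different signal.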