DanBri added a comment.
“Pollution” is a strong word that comes off as needlessly hostile. It seems
prudent and rational to get a broad sense of the landscape (and where it is
moving). The Wikidata data model is not trivially 1:1 with RDF/SPARQL, and
there may be scope for hybrid solutions.
DanBri added subscribers: Lydia_Pintscher, DanBri.
DanBri added a comment.
@Lydia_Pintscher mentioned a conversation with the Data Commons team at
Google. They have this open-source codebase that's somewhat in this area:
https://github.com/datacommonsorg/mixer
DanBri added a comment.
How about generating sitemap files during munging (rather than as part of
mediawiki or wikibase frontend)?
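
A minimal sketch of what emitting sitemap files from a batch step could look like. It is not part of the existing munger: the function names, the page URL pattern in `BASE_URL`, and the idea of passing in an iterable of entity IDs are all assumptions made for illustration. The 50,000-URL cap per file comes from the sitemap protocol.

```python
import xml.etree.ElementTree as ET
from itertools import islice

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit defined by the sitemap protocol
BASE_URL = "https://www.wikidata.org/wiki/"  # assumed page URL pattern


def build_sitemap(entity_ids):
    """Return one sitemap document (as a string) listing the given entity IDs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entity_id in entity_ids:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = BASE_URL + entity_id
    return ET.tostring(urlset, encoding="unicode")


def sitemap_batches(entity_ids, batch_size=MAX_URLS_PER_FILE):
    """Yield sitemap documents, each holding at most batch_size URLs."""
    iterator = iter(entity_ids)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield build_sitemap(batch)
```

Each yielded document would be written to its own file, with a sitemap index file pointing at all of them.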
TASK DETAIL
https://phabricator.wikimedia.org/T273113
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: DanBri
Cc
DanBri added a comment.
See also discussion on Twitter; various other search systems also don’t find
all Q-pages.
https://twitter.com/danbri/status/1362640655399473155?s=21
TASK DETAIL
https://phabricator.wikimedia.org/T273113
DanBri added a comment.
Yes, the front page sounds good.
TASK DETAIL
https://phabricator.wikimedia.org/T236665
To: DanBri
Cc: Lydia_Pintscher, DanBri, johanricher, Aklapper, Tarrow, darthmon_wmde,
DannyS712
DanBri added a comment.
Yes, that minimal example is the minimum Google Dataset Search needs in order
to have enough info to include a dataset in its collection.
https://twitter.com/chrisgorgo/status/1188129727514468352?s=19
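
For illustration only, here is a sketch of how minimal schema.org `Dataset` markup could be generated as JSON-LD. It assumes the minimum is a `name` and a `description`. The example in the linked tweet is not reproduced here, and the `dataset_jsonld` helper and the sample values are hypothetical.

```python
import json


def dataset_jsonld(name, description):
    """Build minimal schema.org Dataset markup as a JSON-LD string.

    Assumption: name and description are the only properties needed.
    """
    markup = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
    }
    return json.dumps(markup, indent=2)


# Hypothetical values, for illustration only.
print(dataset_jsonld("Example dataset", "A short description of what the dataset contains."))
```

The resulting string would go inside a `<script type="application/ld+json">` element on the dataset's landing page.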
TASK DETAIL
https://phabricator.wikimedia.org/T236665
DanBri added a comment.
From the quickest of looks (and totally unofficial w.r.t. my employer, of
course): I don't see the string "entity" in https://www.wikidata.org/robots.txt,
so it may not be excluded there.
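
The same quick check can be sketched in code, using only Python's standard `urllib.robotparser`. The helper names and the `SAMPLE` rules are made up for illustration and are not the contents of the real robots.txt. To check the live file, download its text first and pass it in.

```python
from urllib.robotparser import RobotFileParser


def mentions(robots_txt, needle="entity"):
    """Crude check: does the robots.txt text contain the string at all?"""
    return needle in robots_txt


def is_allowed(robots_txt, url, user_agent="*"):
    """Stricter check: do the parsed rules allow this user agent to fetch url?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Invented sample rules, for illustration only.
SAMPLE = "User-agent: *\nDisallow: /w/\n"

print(mentions(SAMPLE))
print(is_allowed(SAMPLE, "https://www.wikidata.org/entity/Q42"))
```

The substring test mirrors the eyeball check above. The parser-based test is the more reliable of the two, because a path can be excluded by a broader rule that never mentions "entity".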
TASK DETAIL
https://phabricator.wikimedia.org/T227246
DanBri added a comment.
From a quick look, I wonder whether http://schema.org/about would be a good fit. In most cases you're trying to characterize the topic of the page, aren't you?
You could mark it up several ways but the basic idea would be that the 'about' property rela