Thanks for the info! Yes, I was mostly wondering about #1. Thanks for your
work!
On Sat, Sep 12, 2020 at 1:41 AM Tiziano Piccardi
wrote:
> Hi Denny, thanks for the questions!
>
> 1) The time unit is article revision (namespace 0). This means that in your
> example, the article would be
Hi Denny, thanks for the questions!
1) The time unit is article revision (namespace 0). This means that in your
example, the article would be available at T2 and T4. Adding the pages also
at T1 or T3 would mean regenerating all the pages that include the
article, and the resulting dataset would
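
To make the revision-based time unit concrete, here is a minimal sketch of the T1..T4 example from this thread. It assumes integer timestamps and that each page revision is rendered with the latest template revision at or before it, which is one reading of "correct historical macro expansion". The function names are hypothetical and are not taken from the WikiHist.html code.

```python
from bisect import bisect_right

def template_version_at(template_times, t):
    """Return the latest template revision time <= t, or None if there is none.

    template_times must be sorted in ascending order.
    """
    i = bisect_right(template_times, t)
    return template_times[i - 1] if i else None

def snapshots(page_times, template_times):
    """Return one (page_revision, template_revision) pair per page revision.

    Hypothetical helper: snapshots exist only at page revision times, so a
    template edit on its own never produces a new snapshot.
    """
    template_times = sorted(template_times)
    return [(p, template_version_at(template_times, p)) for p in sorted(page_times)]

# P revised at T2 and T4; T revised at T1 and T3 (times as integers 1..4).
print(snapshots([2, 4], [1, 3]))
```

Under these assumptions the page has two snapshots, at T2 (using the template as of T1) and at T4 (using the template as of T3). No snapshot is produced at T3.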
Three questions:
1) Assume a page P with a Template T.
P has been modified at time T2 and T4.
T has been modified at T1 and T3.
Will P be available as of T2 and T4 only, or also as of T3? (at which point
it will be different than at T2 or T4).
2) What about changes to Wikidata, Commons, or UI
Thanks Federico and WSC for the interest!
I want to specify that we used only public data released in the XML dump.
As WSC said, deleted content is not always permanently removed from the
database, but it is available only to users with privileged access. Our goal
is not only to release the
I wouldn't use the phrase "Wikipedia’s deliberate policy of permanently
deleting the
entire history of deleted pages". Quite a few "deleted" pages do actually
get restored, and depending on the deletion process it can be quite easy to
get much deleted content back. Especially if someone volunteers
Thanks Federico.
I'm cc'ing Tiziano, who has been leading this project and can chime in.
All the best,
Bob
On Fri, Sep 11, 2020 at 11:22 AM Federico Leva (Nemo)
wrote:
> Robert West, 11/09/20 11:29:
> > local instances of MediaWiki,
> > enhanced with the capacity of correct historical macro
Robert West, 11/09/20 11:29:
> local instances of MediaWiki,
> enhanced with the capacity of correct historical macro expansion.
Interesting. I see this doesn't include deleted templates. Have you
considered using historical dumps?
«We emphasize that the limitation of deleted pages, templates,
Hi all,
*TL;DR:*
So far, Wikipedia's full revision history has been available only in wiki
markup, not in HTML -- a big limitation for researchers. We are changing
this by releasing WikiHist.html, Wikipedia's full history (up until March
2019) in HTML:
https://zenodo.org/record/3605388