Hi Cristina,

Happy to see you here :)

Just to add to Jaime's answer, here is an example of a Python-based app on Toolforge: <https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Flask_OAuth_tool>
Hope this helps,
Best,
Diego

On Fri, Sep 17, 2021 at 3:12 PM Jaime Crespo <[email protected]> wrote:

> On Fri, Sep 17, 2021 at 3:03 PM Cristina Gava via Analytics <[email protected]> wrote:
>
>> Hi Jaime,
>>
>> Thank you so much for the thorough reply :) All the references are super
>> useful and I'll go through them now. I'll start with Toolforge, since it
>> seems there is consensus on it being the most appropriate tool, and leave
>> the dumps for later if needed.
>> I'll keep you posted.
>
> It will depend a lot on the type of research needed. For example (to play
> devil's advocate, with a simple example), if you wanted to count the
> total number of words written in Wikipedia and observe their frequency
> (meaning reading all edits in history), dumps would be a much better option,
> as wikireplicas only have access to metadata, not the actual data. On top of
> that, reading all edits sequentially will be much faster from a downloaded
> bundle, while on the live MariaDB database access is faster for small
> portions with specific conditions or for small to medium ranges.
>
> I think starting with wikireplicas and later going for the dumps if you
> see it not working for you is a totally reasonable decision, in general, as
> it will require less investment in your local setup.
>
> --
> Jaime Crespo
> <http://wikimedia.org>
> _______________________________________________
> Analytics mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
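
To make Jaime's word-count example a bit more concrete, here is a minimal sketch of reading a dump sequentially in Python. It assumes the dump is a MediaWiki XML export in which each revision's wikitext sits in a `text` element inside a `page` element. The function name `count_words`, the word regex, and the tiny inline sample are illustrative assumptions, not anything taken from the Wikimedia tooling.

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: a "word" is any run of word characters; real wikitext
# would need markup stripping before counting.
WORD_RE = re.compile(r"\w+")


def count_words(xml_stream):
    """Stream an XML export and count words over every revision's text."""
    counts = Counter()
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        # Tags carry an XML namespace prefix like "{...}text"; keep the local name.
        local = elem.tag.rsplit("}", 1)[-1]
        if local == "text" and elem.text:
            counts.update(w.lower() for w in WORD_RE.findall(elem.text))
        elif local == "page":
            # Drop the page's children once it has been fully processed.
            elem.clear()
    return counts


# Tiny hand-written sample standing in for a real dump file (hypothetical data).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example</title>
    <revision><id>1</id><text>Hello world</text></revision>
    <revision><id>2</id><text>Hello wiki world</text></revision>
  </page>
</mediawiki>"""

counts = count_words(io.BytesIO(SAMPLE))
print(counts.most_common(3))
```

For a real compressed history dump, the same function should work on a stream opened with `bz2.open(path, "rb")` from the standard library, since it only ever reads the file front to back.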
