Re: [Wikitech-l] How to get number of articles or related metrics for indic wiki's ?

2020-05-08 Thread Bryan Davis
On Fri, May 8, 2020 at 2:03 PM Shrinivasan T  wrote:
>
> Hello all,
>
>  I am planning to build a Grafana dashboard using Prometheus for the counts
> of all Indic wiki articles.

The dashboards you are thinking of may actually already exist.
Discovery of tools is a real problem in our community, and one I hope
to be able to work on more in the coming months.

Please take a look at

as an example of the data that the Wikimedia Foundation's Analytics
team publishes to help folks keep track of trends across the Wikimedia
movement's project wikis. More information on the "Wikistats 2"
project can be found at

including information on how you can contribute to this project.

> I have to get all the counts and write a custom exporter.
>
>
> I am planning one dashboard showing article counts for all Indic languages.
>
> Another dashboard would show counts for all wiki projects for a selected
> language.
>
> I have a few queries.
>
> 1. How do I get the number of articles in a wiki, for example Tamil
> Wikipedia? Is there an API to get these numbers?

Basic information on article counts can be fetched from each wiki
using the Action API's action=query&meta=siteinfo endpoint. See
 for more information
about this API.

See 

for an example usage on tawiki.
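
As a rough illustration, a small Python sketch along these lines will
pull the article count for tawiki from that endpoint (the User-Agent
string below is just a placeholder; use something that identifies your
own tool):

    import requests

    # Query the Action API's siteinfo statistics for Tamil Wikipedia.
    resp = requests.get(
        "https://ta.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "meta": "siteinfo",
            "siprop": "statistics",
            "format": "json",
        },
        headers={"User-Agent": "indic-wiki-stats-example/0.1 (placeholder)"},
    )
    resp.raise_for_status()
    stats = resp.json()["query"]["statistics"]
    print("tawiki articles:", stats["articles"])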

The Wikistats 2 project actually pulls its data from a public API as
well! The dashboard I linked above fetches data from
.
This is part of what is known as the "Wikimedia REST API". See
 for more
information on this API collection.
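
As a rough sketch (please double-check the exact endpoint path and
parameter values against the REST API documentation), something like
this pulls the monthly count of new content pages on tawiki:

    import requests

    # AQS "edited-pages/new" metric; the path segments here are my best
    # reading of the public docs and may need adjusting.
    url = (
        "https://wikimedia.org/api/rest_v1/metrics/edited-pages/new/"
        "ta.wikipedia.org/all-editor-types/content/monthly/20200101/20200501"
    )
    resp = requests.get(url, headers={"User-Agent": "indic-wiki-stats-example/0.1"})
    resp.raise_for_status()
    print(resp.json())  # one entry per month with the new-page count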


> 2. Can we run a SPARQL query from our own server?
>
> 3. Once these dashboards are built, can we host the custom exporter,
> Prometheus, and Grafana on the tool server or any wiki cloud server? Whom
> should we contact about hosting these?

Toolforge is probably not a great place to host a Prometheus server
simply because the local disk that you would have available to store
the data sets would be hosted on the shared NFS server which provides
$HOME directories for Toolforge maintainers and their tools.

A Cloud VPS project would be capable of hosting the general software
described. See 
for more information about what a Cloud VPS project is and how you
might apply to create one for your project.

Please be aware that a request to create the project described in this
email would likely receive a response encouraging you to collaborate
with the Wikistats 2 project to achieve your goals rather than making
a new project.

> I will work on these at the remote hackathon this weekend.

I hope my answers here don't spoil your hackathon! Maybe try playing
around with Wikistats 2 and the APIs it uses and think of ways that
you could either add new features to Wikistats 2 or make a tool that
uses data from the same APIs that would be helpful to the Indic
language community.


Bryan
-- 
Bryan Davis  Technical Engagement  Wikimedia Foundation
Principal Software Engineer   Boise, ID USA
[[m:User:BDavis_(WMF)]]  irc: bd808

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Abstract Schema and Schema changes, request for help

2020-05-08 Thread Amir Sarabadani
Hello,
In case you have never made a change to the database schema of MediaWiki
core, let me explain the process (if you already know this, feel free to
skip this paragraph):
* MediaWiki core supports three types of RDBMS: MySQL, SQLite, and
Postgres. It used to be five (plus Oracle and MSSQL).
* For each of these types, a schema change has three parts: 1- Change the
tables.sql file so new installations get the new schema. 2- Make a .sql
schema change file, like an "ALTER TABLE", so current installations can
upgrade. 3- Wire that schema change file into the *Updater.php file.
* For example, this is a patch to drop a column:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/473601 The patch touches
14 different files, adds 94 lines and removes 30.

This is bad for several reasons:
* It is extremely complicated to do even a simple schema change. Something
as simple as adding a column usually takes me a whole day. There are lots
of complicating factors; for example, SQLite has only very limited ALTER
TABLE support, so to add a column you need to create a temporary table with
the new column, copy the old table's data into it, drop the old table, and
then rename the temporary table to the original name (see the sketch after
this list).
** Imagine the pain and sorrow when you want to normalize a table, which
means several schema changes in a row: 1- Add a new table. 2- Add a column
to the old table. 3- Once it is filled, make the new column not-nullable
and make the old column nullable instead. 4- Drop the old column.
* It's almost impossible to test all DBMS types. I don't have MSSQL or
Oracle installed and I don't even know how they differ from MySQL. I assume
most other developers are proficient in one type, not all of them.
* Writing raw SQL, especially duplicated SQL, and doubly so when we have no
CI to test it (because we won't install proprietary software in our
infrastructure), is highly error-prone. My favourite example: a new column
was added to the wrong table in MSSQL and it went unnoticed for two years
(four releases, including one LTS).
* It's impossible to support more DBMS types through extensions or other
third-party systems, because the maintainer would need to keep up with
every patch we add to core and write its equivalent.
* For lots of reasons, these schemas have been diverging; there has been a
fair amount of work just to keep the divergence to a minimum.
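
To make the SQLite point concrete, here is a rough sketch of the rebuild
dance needed just to drop a column (using Python's sqlite3 module purely
for illustration; the real MediaWiki patches spell out the same steps in
raw .sql files, and the table here is made up):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE demo (id INTEGER PRIMARY KEY, name TEXT, obsolete TEXT);
        INSERT INTO demo (name, obsolete) VALUES ('a', 'x'), ('b', 'y');

        -- SQLite cannot simply DROP COLUMN, so rebuild the table instead:
        CREATE TABLE demo_tmp (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO demo_tmp (id, name) SELECT id, name FROM demo;
        DROP TABLE demo;
        ALTER TABLE demo_tmp RENAME TO demo;
    """)
    print(conn.execute("SELECT * FROM demo").fetchall())  # [(1, 'a'), (2, 'b')]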

There was an RFC to introduce abstract schema and schema changes; it was
accepted, and I have been working on implementing it:
https://phabricator.wikimedia.org/T191231

This is not a small task, and like any big piece of work, it's important
to cut it into small pieces and improve things gradually. My plan is to
first abstract the schema (the tables.sql files), and then slowly abstract
schema changes. For now, the plan is to generate these .sql files
automatically through maintenance scripts. So we will have a file called
tables.json, and running something like:
php maintenance/generateSchemaSql.php --json maintenance/tables.json --sql
maintenance/tables-generated.sql --type=mysql
will produce the tables-generated.sql file. The code that produces it is
Doctrine DBAL, which is already installed as a dev dependency of core: you
need Doctrine only if you want to make a schema change; if you just
maintain an instance, you should not need anything extra. Most of the work
for automatically generating the schema is already merged, and the last
part, which wires it up (and migrates two tables), is up for review:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/595240

My request: I need to make lots of patches, and since I'm doing this in my
volunteer capacity, I need developers to review them (and potentially help
with the work if you're as excited about this as I am). Let me know if
you're willing to be added to future patches; the current patch also
welcomes any feedback: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/595240

I have added documentation at
https://www.mediawiki.org/wiki/Manual:Schema_changes describing the plan
and future changes. The ideal end state is that when you want to do a
schema change, you just change tables.json and create a JSON file that is a
snapshot of the table before and after the change (remember, SQLite's ALTER
TABLE is very limited, so it has to know the whole table). Also, once we
are in good shape migrating MediaWiki core, we can start cleaning up
extensions.

Any feedback is also welcome.

Best
-- 
Amir (he/him)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Maps on wikidata

2020-05-08 Thread Michael Holloway
Hi Strainu,

It's probably best if a Wikibase dev confirms, but I think this is what
you're looking for:
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Wikibase/+/master/lib/includes/Formatters/CachingKartographerEmbeddingHandler.php#198

-mdh

On Fri, May 8, 2020 at 12:44 PM Strainu  wrote:

> Hey folks,
>
> Can someone point me to the code that decides which zoom to use for
> the maps that are displayed in the items with coordinates?
>
> Thanks,
>Strainu
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] How to get number of articles or related metrics for indic wiki's ?

2020-05-08 Thread Shrinivasan T
Hello all,

I am planning to build a Grafana dashboard using Prometheus for the counts
of all Indic wiki articles.

I have to get all the counts and write a custom exporter.


I am planning one dashboard showing article counts for all Indic languages.

Another dashboard would show counts for all wiki projects for a selected
language.

I have a few queries.

1. How do I get the number of articles in a wiki, for example Tamil
Wikipedia? Is there an API to get these numbers?

2. Can we run a SPARQL query from our own server?

3. Once these dashboards are built, can we host the custom exporter,
Prometheus, and Grafana on the tool server or any wiki cloud server? Whom
should we contact about hosting these?

I will work on these at the remote hackathon this weekend.

Shrini
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Maps on wikidata

2020-05-08 Thread Strainu
Hey folks,

Can someone point me to the code that decides which zoom to use for
the maps that are displayed in the items with coordinates?

Thanks,
   Strainu

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l