A problem with the category hierarchy is that any subcategory that is even 
slightly out of place brings in a whole branch of anomalous subjects below it. 

So making a report like 
https://stats.wikimedia.org/wikimedia/pageviews/categorized/wp-en/2015-06/pageviews_wp-en_cat_WikiProject_Medicine_2015-06.html
involved repeated rounds of pruning weird subbranches: I manually built a 
blacklist of nodes not to follow, which I could feed into the script.

 

I gave up on the other project, 
https://stats.wikimedia.org/EN/CategoryOverviewIndex.htm, as it became too 
unwieldy. Also, a category sometimes reappeared as a great-grandchild of 
itself. I had to detect that in order to avoid loops. 

 

Those are two of the pitfalls.
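
To make the two pitfalls concrete, here is a minimal Python sketch of a category walk that skips blacklisted nodes and guards against loops. It is not the actual script: the `subcats` mapping, the `blacklist` set, the function name `walk_categories` and the toy category names are all invented for illustration.

```python
from collections import deque


def walk_categories(root, subcats, blacklist=frozenset()):
    """Return the categories reachable from root, in breadth-first order.

    subcats   -- hypothetical dict: category name -> list of direct subcategories
    blacklist -- names whose whole branch should not be followed
    """
    seen = {root}  # guards against a category reappearing below itself
    order = []
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        order.append(cat)
        for child in subcats.get(cat, []):
            # Skip pruned branches and anything already visited (loop guard).
            if child in blacklist or child in seen:
                continue
            seen.add(child)
            queue.append(child)
    return order


# Toy data (invented): "Medicine" reappears as a great-grandchild of itself,
# and "Fictional doctors" stands in for an out-of-place branch.
toy = {
    "Medicine": ["Diseases", "Fictional doctors"],
    "Diseases": ["Infectious diseases"],
    "Infectious diseases": ["Medicine"],
    "Fictional doctors": ["Television characters"],
}
print(walk_categories("Medicine", toy, blacklist={"Fictional doctors"}))
# -> ['Medicine', 'Diseases', 'Infectious diseases']
```

The `seen` set is what stops the loop; the blacklist still has to be built by hand, one weird branch at a time.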

 

As for navigation: even a smallish ('concise') example from my second project, 
https://stats.wikimedia.org/EN/CategoryOverview_EN_Concise.htm, shows how 
daunting it gets. What I often see is a dynamic navigator that shows only one 
level up or down, which makes me feel I'm in a maze.

 

Erik

From: Analytics [mailto:[email protected]] On Behalf Of Dan Andreescu
Sent: Monday, November 27, 2017 16:14
To: A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics. <[email protected]>
Subject: Re: [Analytics] Tool to visualize which wiki pages link to which wiki pages?

 

Hi Andre.  Jaime's query is a good starting point; it would get you the data 
you need for one wiki.  We can import the templatelinks table and then run the 
query on Hadoop to get all wikis at once (we already have the other tables).

 

But once we have that, we'd have a graph with millions of nodes and edges.  
That's not possible to consume in visual form, so you would have to serve 
slices of the data and visualize parts of the graph.  The question is, then, 
what purpose would this visualization have?  If that's well defined, maybe we 
can figure out what slices of the data would be most useful.
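
One possible way to cut such a slice is a depth-limited breadth-first walk from a starting page. This is only a sketch under stated assumptions: the links are taken to be available already as a dict from page title to linked titles, and the `links` mapping and the function `slice_graph` are hypothetical names, not an existing tool.

```python
from collections import deque


def slice_graph(start, links, max_depth):
    """Return the link edges reachable from start within max_depth hops.

    links -- hypothetical dict: page title -> list of page titles it links to
    """
    depth = {start: 0}  # hop count at which each page was first reached
    edges = []
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depth[page] == max_depth:
            continue  # do not expand pages at the depth limit
        for target in links.get(page, []):
            edges.append((page, target))
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return edges


# Toy link data, invented for illustration.
toy_links = {"A": ["B", "C"], "B": ["D"], "C": ["A"], "D": ["E"]}
print(slice_graph("A", toy_links, max_depth=2))
# -> [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'A')]
```

The edge list that comes back is small enough to hand to a graph-drawing tool; which start page and depth to pick still depends on the purpose of the visualization.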

 

On Tue, Nov 21, 2017 at 2:04 PM, Joseph Allemandou <[email protected]> wrote:

Hi Andre,

I'm not aware of any tool as you describe.

I do think it would be super useful, though!

I'll think about it some more and possibly draft a ticket.

Cheers

Joseph

 

On Tue, Nov 21, 2017 at 4:29 PM, Federico Leva (Nemo) <[email protected]> wrote:

Andre Klapper, 21/11/2017 17:15:

I've been wondering if anyone's aware of any visualization tool that
draws a graph showing which wiki pages are linked from which other wiki
pages (up to a certain depth)


The closest thing I can think of is Erik's chart of category links, generated 
with a script which is published somewhere and could be adapted at least for 
simple regex filters.
https://stats.wikimedia.org/EN/CategoryOverviewIndex.htm
https://stats.wikimedia.org/wikimedia/pageviews/categorized/

There's also 
<http://www.chrisharrison.net/index.php/Visualizations/ClusterBall> and a graph 
of links between user pages, which was made perhaps in 2014.

Federico



_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics





 

-- 

Joseph Allemandou

Data Engineer @ Wikimedia Foundation

IRC: joal

