Re: [Wikitech-l] Scope problem with scribunto happening when invoking but not in debug console

2013-08-22 Thread Mathieu Stumpf

On 2013-08-21 at 22:54, Brad Jorsch (Anomie) wrote:

On Wed, Aug 21, 2013 at 4:13 PM, Mathieu Stumpf 
psychosl...@culture-libre.org wrote:

But no, it doesn't, I still generate a randomly ordered wiki table. So,
what did I miss?



Two things:
1. table.sort() only sorts the sequence part of the table. So only the
elements 1-9 in your table, not 0 or A-E.
2. pairs() doesn't process the elements in any particular order; you can
see this by printing out the keys as you iterate through using pairs().
You'd need to use ipairs instead (which again only does the sequence part
of the table).


So as I understand it, I will have to create a new table indexed with integers
using pairs(), sort that table, and then use ipairs() on it. It looks to me
like something that could be of quite common use, so maybe integrating a
function into the mw.* library would be interesting. On the other hand, maybe
I'm just not accustomed to the idiomatic way of writing Lua code, and one may
suggest that I structure my code in another way.
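
For illustration, a minimal Lua sketch of that approach (the table t and its
contents are made up here; they are not the module's actual data):

    -- Collect the keys of a mixed-key table with pairs(), sort that array,
    -- then iterate it in order with ipairs().
    local t = { [0] = 'zero', 'one', 'two', 'three', A = 'a', B = 'b' }

    local keys = {}
    for k in pairs( t ) do
        table.insert( keys, k )
    end

    -- Mixed key types cannot be compared with < directly, so sort on their
    -- string form; adjust the comparator to whatever ordering is wanted.
    table.sort( keys, function ( a, b ) return tostring( a ) < tostring( b ) end )

    for _, k in ipairs( keys ) do
        mw.log( k, t[k] )  -- or emit the wikitable row for t[k] here
    end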

--
Association Culture-Libre
http://www.culture-libre.org/


Re: [Wikitech-l] HTTPS for logged in users delayed. New date: August 28

2013-08-22 Thread Risker
On 21 August 2013 13:56, Rob Lanphier ro...@wikimedia.org wrote:

 Hi everyone,

 After assessing the current readiness (or lack thereof) of our HTTPS
 code, we've decided to postpone the deployment for a week.  We have a
 number of things that we'd like to get cleaner resolution on:

 *  Use of GeoIP vs enabling on per wiki basis
 *  Use of a preference vs login form checkbox vs hidden option vs
 sensible default
 *  How interactions with login.wikimedia.org will work
 *  Validation of our HTTPS test methodology

 The new plan is to deploy on Wednesday, August 28 between 20:00 UTC
 and 23:00 UTC.  Prior to that, we plan on having a very limited
 deployment to our test wikis, and we're also planning to deploy to
 mediawiki.org.  Assuming this is sorted out and we have made our test
 deployments by end of day Monday, August 26, we should have time to
 validate our assumptions and give people time to see the new system in
 action.

 More info is (or will be) available here:
 https://meta.wikimedia.org/wiki/HTTPS
 (or here if you prefer: http://meta.wikimedia.org/wiki/HTTPS )

 Thanks everyone for your patience.

 Rob


Thanks for keeping everyone updated, Rob - and thanks to those who have
been working on this.

Risker

Re: [Wikitech-l] Wikimedia's anti-surveillance plans: site hardening

2013-08-22 Thread Faidon Liambotis

On Sat, Aug 17, 2013 at 05:55:36PM -0400, Sumana Harihareswara wrote:

I suggest that we also update either
https://meta.wikimedia.org/wiki/HTTPS or a hub page on
http://wikitech.wikimedia.org/ or
https://www.mediawiki.org/wiki/Security_auditing_and_response with
up-to-date plans, to make it easier for experts inside and outside the
Wikimedia community to get up to speed and contribute.  For topics under
internal discussion and investigation, I would love a simple bullet
point saying: we're thinking about that, sorry nothing public or
concrete yet, contact $person if you have experience to share.


This is a good suggestion. We had a pad that we've been working on even 
before this thread; a few of us (Ryan, Mark, Asher, Ken, myself) met the 
other day and worked a bit on our strategy from the operations 
perspective and put out our notes at:

https://wikitech.wikimedia.org/wiki/HTTPS/Future_work

It's still a very rudimentary bullet-point summary, so it might not be an 
easy read. Feel free to ask questions here or on-wiki.


There are obviously still a lot of unknowns -- we have a lot of "evaluate 
this" TODO items. Feel free to provide feedback or audit our choices, though; 
it'd be very much welcome. If you feel you can help in some of these 
areas in some other way, feel free to say so and we'll try to find a 
way to make it happen.


Regards,
Faidon


[Wikitech-l] SMWCon Fall 2013: 2nd Call for Contributions. Registration.

2013-08-22 Thread Yury Katkov
Dear semantic wiki users and developers,

We are very happy to announce that early bird registration to the 8th
Semantic MediaWiki Conference is now open [2]!

Important facts reminder:
--
* Dates: October 28th to October 30th 2013 (Monday to Wednesday)
* Location: AO Berlin Hauptbahnhof, Lehrter Str. 12, 10557 Berlin, Germany
* Conference wiki page: https://semantic-mediawiki.org/wiki/SMWCon_Fall_2013
* Participants: Everybody interested in semantic wikis, especially in
Semantic MediaWiki, e.g. users, developers, consultants, business
representatives and researchers.

We welcome new contributions from you:
--
* We encourage contributions about applications and development of
semantic wikis; for a list of topics, see [1]
* Please propose regular talks, posters or super-short lightning talks
on the conference website. We will provide feedback to you and do our
best to consider your proposal in the conference program
* Tutorials and presentations will be video and audio recorded and
made available for others after the conference.
* If you've already announced your talk, it's now time to expand its description

News on participation, tutorials and keynote:
--
* You can now officially register for the conference [2] and benefit
from early bird fees until September 14, 2013
* The tutorial program has been announced and is available [3]. This year
we have two tutorial tracks:
  ** The beginner's tutorials are focused on business applications of
semantic wikis
  ** The developer's tutorials will help you to become a Semantic
MediaWiki programmer: contributing to core and writing extensions
for your specific needs
* Professor Yolanda Gil from the University of Southern California [4]
will give a keynote on scientific data curation

Organizers and sponsors
--
* Wikimedia Deutschland e. V. [5] has become the official organiser of
SMWCon Fall 2013
* Thanks to our sponsors WikiVote [6] (platinum) and ArchiXL [7]
(gold), fees for this SMWCon remain low

If you have questions you can contact Yury Katkov (Program Chair),
Benedikt Kämpgen (General Chair) or Karsten Hoffmeyer (Local Chair)
per e-mail (Cc).

We will be happy to see you in Berlin!

Yury Katkov, Program Chair

[1] http://semantic-mediawiki.org/wiki/SMWCon_Fall_2013/Announcement
[2] http://de.amiando.com/PVADAOV.html
[3] http://semantic-mediawiki.org/wiki/SMWCon_Fall_2013#Program
[4] http://www.isi.edu/~gil/
[5] https://www.wikimedia.de/wiki/Hauptseite
[6] http://wikivote.ru/
[7] http://www.archixl.nl/


Re: [Wikitech-l] RFC: LESS support in MediaWiki core

2013-08-22 Thread Krinkle
On Aug 20, 2013, at 2:31 AM, Tyler Romeo tylerro...@gmail.com wrote:

 As long as the change does not inhibit extensions from hooking in and using
 other CSS pre-processors, I don't see any issue with using LESS in core.
 


However, if and when we adopt LESS support in core (which only happens if we
also incorporate it into our MediaWiki coding conventions), complying
extensions will, by convention, not be allowed to use other
pre-processors.

However, I agree that ResourceLoader in general should be agnostic and allow
implementation and usage of whatever you want in your own extensions.

From a quick look at the draft patch set and the existing extension [1][2]
that already implements this, we can conclude that this is already the case,
and I'll hereby vouch for continued support of such extensibility for other
pre-processors as well. However, core should only include one pre-processor
at most.

-- Krinkle

[1] https://www.mediawiki.org/wiki/Extension:Less
[2] https://github.com/wikimedia/mediawiki-extensions-Less


Re: [Wikitech-l] Engaging users with the old plain newsletter

2013-08-22 Thread Quim Gil
I sent this to the Editor Engagement list, but maybe there are more 
people interested here.


On 08/20/2013 06:39 AM, Quim Gil wrote:

Upon account creation or in your user profile: Send me important email
updates

You have seen this feature in many collaborative sites, but not in
Wikimedia sites. Unless I'm missing something, this common feature
doesn't seem to be available in MediaWiki core or through extensions.
Users can get notifications based on activity on wiki pages, and they can
also get emails from other users, but is there a way for an admin to
send an update or a newsletter via email to all users who ticked the box?

While new users will hardly be used to watchlists or user talk pages,
and while it will be difficult for them to find the Wikimedia blog, The
Signpost, etc., all of them have email, and a good percentage would be happy
to receive emails from time to time with interesting information. Has this
option been discussed?

PS: as a mediawiki.org admin I wish we had something like this to engage
new and old contributors.


After a bit of investigation:

- Indeed, there seems to be no extension doing this.

- Translatewiki.net does have a newsletter and users can opt in/out from 
their preferences, but according to Siebrand it is a very hackish/ad hoc 
approach involving manual SQL queries and Mailman.


- According to Kaldari, the basic implementation as an extension should 
be pretty simple, not even requiring Echo. The building blocks are 
already provided by plain MediaWiki.


As I see it, this could have two levels of implementation:

1. Just one site-wide notification from admins that users can opt in to 
from their preferences.


2. The possibility for a designated user group to add more topics (e.g. New 
release announcements, Activities for contributors...) and allow people 
to subscribe to them.


Do you see this as interesting for mediawiki.org and/or wikitech.wikimedia.org? 
For any other Wikimedia sites? For your own MediaWiki? I can't develop 
this, but I would be happy to volunteer as PM & tester (here and/or in 
my personal pet project).


--
Quim Gil
Technical Contributor Coordinator @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil


Re: [Wikitech-l] unexpected error info in HTML

2013-08-22 Thread Sumana Harihareswara
On 08/01/2013 03:08 AM, Jiang BIAN bianji...@google.com wrote:
 Hi,
 
 I noticed some pages we crawled contain an error message like this:
 
 <div id="mw-content-text" lang="zh-CN" dir="ltr" class="mw-content-ltr">
 <p class="error">Failed to render property P373: Wikibase\LanguageWithConversion::factory: given languages do not have the same parent language</p>
 
 
 But when I open the URL in a browser, there is no such message. And using
 index.php can also get normal content without error messages.
 
 Here are examples you can retry:
 
 bad
 $ wget 'http://zh.wikipedia.org/zh-cn/Google'
 
 good
 $ wget 'http://zh.wikipedia.org/w/index.php?title=Google'
 
 
 Looks like something is wrong on the Wikipedia side; is there anything we need to fix?
 
 
 
 Thanks

I checked with Jiang Bian and found out that this is still happening --
can anyone help Google out here? :-)

-- 
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation


Re: [Wikitech-l] unexpected error info in HTML

2013-08-22 Thread Liangent
On Fri, Aug 23, 2013 at 7:06 AM, Sumana Harihareswara suma...@wikimedia.org
 wrote:

 On 08/01/2013 03:08 AM, Jiang BIAN bianji...@google.com wrote:
  Hi,
 
  I noticed some pages we crawled containing error message like this;
 
  div id=mw-content-text lang=zh-CN dir=ltr
 class=mw-content-ltrp
  class=errorFailed to render property P373:
  Wikibase\LanguageWithConversion::factory: given languages do not have the
  same parent language/p
 
 
  But when I open the url in browser, there is no such message. And using
  index.php can also get normal content without error messages.
 
  Here are examples you can retry:
 
  bad
  $ wget 'http://zh.wikipedia.org/zh-cn/Google'
 
  good
  $ wget 'http://zh.wikipedia.org/w/index.php?title=Google'
 
 
  Looks like something is wrong on Wikipedia side, anything need to fix?
 
 
 
  Thanks

 I checked with Jiang Bian and found out that this is still happening --
 can anyone help Google out here? :-)

 --
 Sumana Harihareswara
 Engineering Community Manager
 Wikimedia Foundation

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l


There was a bug in some Wikibase version deployed in July which caused this
error, but a fix was backported soon and since then I've never seen any
similar error as a logged in user. If you still see some errors only when
unlogged in at particular URLs (like what you described) now, it's likely
that those URLs got cached in Squid when the bug was live... In this case
purging those pages[1] should be able to fix the issue.

[1] https://en.wikipedia.org/wiki/Wikipedia:Purge

-Liangent

[Wikitech-l] Fwd: Java-based Wiktionary Library (JWKTL) 1.0.0 released as open source software

2013-08-22 Thread Sumana Harihareswara
Forwarding to the Wikidata tech list in case this makes a future
Wiktionary collaboration easier.


 Original Message 
Subject: [Wiki-research-l] Java-based Wiktionary Library (JWKTL) 1.0.0
released as open source software (Wiki-research-l Digest, Vol 96, Issue 22)
Date: Tue, 20 Aug 2013 14:20:56 +
From: Judith Eckle-Kohler eckle-koh...@ukp.informatik.tu-darmstadt.de
Reply-To: wiki-researc...@lists.wikimedia.org
To: wiki-researc...@lists.wikimedia.org
wiki-researc...@lists.wikimedia.org


[Apologies for X-posting]


We are pleased to announce the release of the Java-based Wiktionary
Library (JWKTL) 1.0.0 - an application programming interface for Wiktionary.

Project homepage: http://code.google.com/p/jwktl/


== Overview ==

JWKTL (Java-based Wiktionary Library) is an application programming
interface for the free multilingual online dictionary Wiktionary
(http://www.wiktionary.org). JWKTL enables efficient and structured
access to the information encoded in the English, the German, and the
Russian Wiktionary language editions, including sense definitions, part
of speech tags, etymology, example sentences, translations, semantic
relations, and many other lexical information types. The Russian JWKTL
parser is based on Wikokit (http://code.google.com/p/wikokit/).

Prior to being available as open source software, JWKTL has been a
research project at the Ubiquitous Knowledge Processing (UKP) Lab of the
Technische Universität Darmstadt, Germany. The following people have
mainly contributed to this project: Yevgen Chebotar, Iryna Gurevych,
Christian M. Meyer, Christof Müller, Lizhen Qu, Torsten Zesch.


== Publications ==

A detailed description of Wiktionary and JWKTL is available in our
scientific articles:

* Christian M. Meyer and Iryna Gurevych: Wiktionary: A new rival for
expert-built lexicons? Exploring the possibilities of collaborative
lexicography, Chapter 13 in S. Granger & M. Paquot (Eds.): Electronic
Lexicography, pp. 259–291, Oxford: Oxford University Press, November
2012.
(http://www.ukp.tu-darmstadt.de/publications/details/?no_cache=1tx_bibtex_pi1%5Bpub_id%5D=TUD-CS-2012-0008)
* Christian M. Meyer and Iryna Gurevych: OntoWiktionary – Constructing
an Ontology from the Collaborative Online Dictionary Wiktionary, chapter
6 in M. T. Pazienza and A. Stellato (Eds.): Semi-Automatic Ontology
Development: Processes and Resources, pp. 131–161, Hershey, PA: IGI
Global, February 2012.
(http://www.ukp.tu-darmstadt.de/publications/details/?no_cache=1tx_bibtex_pi1%5Bpub_id%5D=TUD-CS-2011-0202)
* Torsten Zesch, Christof Müller, and Iryna Gurevych: Extracting Lexical
Semantic Knowledge from Wikipedia and Wiktionary, in: Proceedings of the
6th International Conference on Language Resources and Evaluation
(LREC), pp. 1646–1652, May 2008, Marrakech, Morocco.
(http://www.ukp.tu-darmstadt.de/publications/details/?no_cache=1tx_bibtex_pi1%5Bpub_id%5D=TUD-CS-2008-4)


== License and Availability ==

The latest version of JWKTL is available via Maven Central. If you use
Maven as your build tool, then you can add JWKTL as a dependency in your
pom.xml file:

<dependency>
   <groupId>de.tudarmstadt.ukp.jwktl</groupId>
   <artifactId>jwktl</artifactId>
   <version>1.0.0</version>
</dependency>

JWKTL is available as open source software under the Apache License 2.0
(ASL). The software thus comes as is without any warranty (see license
text for more details). JWKTL makes use of Berkeley DB Java Edition
5.0.73 (Sleepycat License), Apache Ant 1.7.1 (ASL), Xerces 2.9.1 (ASL),
JUnit 4.10 (CPL).

Some classes have been taken from the Wikokit project (available under
multiple licenses, redistributed under the ASL license). See NOTICE.txt
for further details.


== Contact ==

Please direct any questions or suggestions to

  https://groups.google.com/forum/#!forum/jwktl-users
  Group E-Mail: jwktl-us...@googlegroups.com


Best wishes,


Christian M. Meyer


--
Christian M. Meyer, M.Sc.
Doctoral Researcher
Ubiquitous Knowledge Processing (UKP Lab)
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
Phone [+49] (0)6151 16-5386, fax -5455, room S2/02/B113
me...@ukp.informatik.tu-darmstadt.de
www.ukp.tu-darmstadt.de
Web Research at TU Darmstadt (WeRC)
www.werc.tu-darmstadt.de

Re: [Wikitech-l] unexpected error info in HTML

2013-08-22 Thread Jiang BIAN
We are actually crawling the HTML via a bot, so the bug is not actually fixed
for non-logged-in users, right?

Could you share the bug's link?

Thanks


On Thu, Aug 22, 2013 at 4:38 PM, Liangent liang...@gmail.com wrote:

 On Fri, Aug 23, 2013 at 7:06 AM, Sumana Harihareswara 
 suma...@wikimedia.org
  wrote:

  On 08/01/2013 03:08 AM, Jiang BIAN bianji...@google.com wrote:
   Hi,
  
   I noticed some pages we crawled containing error message like this;
  
   div id=mw-content-text lang=zh-CN dir=ltr
  class=mw-content-ltrp
   class=errorFailed to render property P373:
   Wikibase\LanguageWithConversion::factory: given languages do not have
 the
   same parent language/p
  
  
   But when I open the url in browser, there is no such message. And using
   index.php can also get normal content without error messages.
  
   Here are examples you can retry:
  
   bad
   $ wget 'http://zh.wikipedia.org/zh-cn/Google'
  
   good
   $ wget 'http://zh.wikipedia.org/w/index.php?title=Google'
  
  
   Looks like something is wrong on Wikipedia side, anything need to fix?
  
  
  
   Thanks
 
  I checked with Jiang Bian and found out that this is still happening --
  can anyone help Google out here? :-)
 
  --
  Sumana Harihareswara
  Engineering Community Manager
  Wikimedia Foundation
 
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 

 There was a bug in some Wikibase version deployed in July which caused this
 error, but a fix was backported soon and since then I've never seen any
 similar error as a logged in user. If you still see some errors only when
 unlogged in at particular URLs (like what you described) now, it's likely
 that those URLs got cached in Squid when the bug was live... In this case
 purging those pages[1] should be able to fix the issue.

 [1] https://en.wikipedia.org/wiki/Wikipedia:Purge

 -Liangent
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l




-- 
Jiang BIAN

This email may be confidential or privileged.  If you received this
communication by mistake, please don't forward it to anyone else, please
erase all copies and attachments, and please let me know that it went to
the wrong person.  Thanks.

Re: [Wikitech-l] unexpected error info in HTML

2013-08-22 Thread Liangent
On Fri, Aug 23, 2013 at 8:13 AM, Jiang BIAN bianji...@google.com wrote:

 We are actually crawling the HTML via bot, so the bug is not actually fixed
 for non-login user, right?


I can't think of a good way to fix the problem from this angle besides
waiting for the old cached pages to expire, unless some sysadmin is happy to
nuke all existing Squid cached pages.

However, if you have a list of affected pages from your crawl, which
we don't have, you can simply purge them in batch and recrawl those pages.
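
As an illustration only, a rough sketch of generating such a batch of purge
requests (plain Lua; the title list is made up, and depending on configuration
anonymous purge requests may require confirming with a POST):

    -- Build action=purge URLs for a known list of affected pages; an HTTP
    -- client (or wget, as in the original report) can then request them.
    local titles = { 'Google', 'Some_other_affected_page' }
    for _, title in ipairs( titles ) do
        print( 'https://zh.wikipedia.org/w/index.php?title=' .. title .. '&action=purge' )
    end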


 Could you share the bug's link?


There was no bug created in bugzilla... I submitted a patch[1] directly to
fix the bug once it was spotted.

[1] https://gerrit.wikimedia.org/r/#/c/76060/

-Liangent



 Thanks


 On Thu, Aug 22, 2013 at 4:38 PM, Liangent liang...@gmail.com wrote:

  On Fri, Aug 23, 2013 at 7:06 AM, Sumana Harihareswara 
  suma...@wikimedia.org
   wrote:
 
   On 08/01/2013 03:08 AM, Jiang BIAN bianji...@google.com wrote:
Hi,
   
I noticed some pages we crawled containing error message like this;
   
div id=mw-content-text lang=zh-CN dir=ltr
   class=mw-content-ltrp
class=errorFailed to render property P373:
Wikibase\LanguageWithConversion::factory: given languages do not have
  the
same parent language/p
   
   
But when I open the url in browser, there is no such message. And
 using
index.php can also get normal content without error messages.
   
Here are examples you can retry:
   
bad
$ wget 'http://zh.wikipedia.org/zh-cn/Google'
   
good
$ wget 'http://zh.wikipedia.org/w/index.php?title=Google'
   
   
Looks like something is wrong on Wikipedia side, anything need to
 fix?
   
   
   
Thanks
  
   I checked with Jiang Bian and found out that this is still happening --
   can anyone help Google out here? :-)
  
   --
   Sumana Harihareswara
   Engineering Community Manager
   Wikimedia Foundation
  
   ___
   Wikitech-l mailing list
   Wikitech-l@lists.wikimedia.org
   https://lists.wikimedia.org/mailman/listinfo/wikitech-l
  
 
  There was a bug in some Wikibase version deployed in July which caused
 this
  error, but a fix was backported soon and since then I've never seen any
  similar error as a logged in user. If you still see some errors only when
  unlogged in at particular URLs (like what you described) now, it's likely
  that those URLs got cached in Squid when the bug was live... In this case
  purging those pages[1] should be able to fix the issue.
 
  [1] https://en.wikipedia.org/wiki/Wikipedia:Purge
 
  -Liangent
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 



 --
 Jiang BIAN

 This email may be confidential or privileged.  If you received this
 communication by mistake, please don't forward it to anyone else, please
 erase all copies and attachments, and please let me know that it went to
 the wrong person.  Thanks.
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] unexpected error info in HTML

2013-08-22 Thread Jiang BIAN
Thanks for the link, but I think this targets the language-variant-related
issue specifically.

We actually observed stale cache in a wider range, see the bug entry:
https://bugzilla.wikimedia.org/show_bug.cgi?id=46014


On Thu, Aug 22, 2013 at 5:26 PM, Liangent liang...@gmail.com wrote:

 On Fri, Aug 23, 2013 at 8:13 AM, Jiang BIAN bianji...@google.com wrote:

  We are actually crawling the HTML via bot, so the bug is not actually
 fixed
  for non-login user, right?
 

 I can't think of a good way to fix the problem from this aspect besides
 waiting for old cached page to expire, unless some sysadmin is happy to
 nuke all existing Squid cached pages.

 However if you have a list of affected pages as you're crawling HTML, which
 we don't have, you can simply purge them in batch and recrawl those pages.


  Could you share the bug's link?
 

 There was no bug created in bugzilla... I submitted a patch[1] directly to
 fix the bug once it was spotted.

 [1] https://gerrit.wikimedia.org/r/#/c/76060/

 -Liangent


 
  Thanks
 
 
  On Thu, Aug 22, 2013 at 4:38 PM, Liangent liang...@gmail.com wrote:
 
   On Fri, Aug 23, 2013 at 7:06 AM, Sumana Harihareswara 
   suma...@wikimedia.org
wrote:
  
On 08/01/2013 03:08 AM, Jiang BIAN bianji...@google.com wrote:
 Hi,

 I noticed some pages we crawled containing error message like this;

 div id=mw-content-text lang=zh-CN dir=ltr
class=mw-content-ltrp
 class=errorFailed to render property P373:
 Wikibase\LanguageWithConversion::factory: given languages do not
 have
   the
 same parent language/p


 But when I open the url in browser, there is no such message. And
  using
 index.php can also get normal content without error messages.

 Here are examples you can retry:

 bad
 $ wget 'http://zh.wikipedia.org/zh-cn/Google'

 good
 $ wget 'http://zh.wikipedia.org/w/index.php?title=Google'


 Looks like something is wrong on Wikipedia side, anything need to
  fix?



 Thanks
   
I checked with Jiang Bian and found out that this is still happening
 --
can anyone help Google out here? :-)
   
--
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation
   
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
   
  
   There was a bug in some Wikibase version deployed in July which caused
  this
   error, but a fix was backported soon and since then I've never seen any
   similar error as a logged in user. If you still see some errors only
 when
   unlogged in at particular URLs (like what you described) now, it's
 likely
   that those URLs got cached in Squid when the bug was live... In this
 case
   purging those pages[1] should be able to fix the issue.
  
   [1] https://en.wikipedia.org/wiki/Wikipedia:Purge
  
   -Liangent
   ___
   Wikitech-l mailing list
   Wikitech-l@lists.wikimedia.org
   https://lists.wikimedia.org/mailman/listinfo/wikitech-l
  
 
 
 
  --
  Jiang BIAN
 
  This email may be confidential or privileged.  If you received this
  communication by mistake, please don't forward it to anyone else, please
  erase all copies and attachments, and please let me know that it went to
  the wrong person.  Thanks.
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l




-- 
Jiang BIAN

This email may be confidential or privileged.  If you received this
communication by mistake, please don't forward it to anyone else, please
erase all copies and attachments, and please let me know that it went to
the wrong person.  Thanks.

Re: [Wikitech-l] unexpected error info in HTML

2013-08-22 Thread Liangent
On Fri, Aug 23, 2013 at 8:33 AM, Jiang BIAN bianji...@google.com wrote:

 Thanks for the link. But I think this is targeting the language variant
 related fix.


This is the root cause of the behavior you mentioned. (It only happens /
happened on zhwiki, and maybe on some other wikis with variants as well, right?)

-Liangent



 We actually observed stale cache in a wider range, see the bug entry:
 https://bugzilla.wikimedia.org/show_bug.cgi?id=46014


 On Thu, Aug 22, 2013 at 5:26 PM, Liangent liang...@gmail.com wrote:

  On Fri, Aug 23, 2013 at 8:13 AM, Jiang BIAN bianji...@google.com
 wrote:
 
   We are actually crawling the HTML via bot, so the bug is not actually
  fixed
   for non-login user, right?
  
 
  I can't think of a good way to fix the problem from this aspect besides
  waiting for old cached page to expire, unless some sysadmin is happy to
  nuke all existing Squid cached pages.
 
  However if you have a list of affected pages as you're crawling HTML,
 which
  we don't have, you can simply purge them in batch and recrawl those
 pages.
 
 
   Could you share the bug's link?
  
 
  There was no bug created in bugzilla... I submitted a patch[1] directly
 to
  fix the bug once it was spotted.
 
  [1] https://gerrit.wikimedia.org/r/#/c/76060/
 
  -Liangent
 
 
  
   Thanks
  
  
   On Thu, Aug 22, 2013 at 4:38 PM, Liangent liang...@gmail.com wrote:
  
On Fri, Aug 23, 2013 at 7:06 AM, Sumana Harihareswara 
suma...@wikimedia.org
 wrote:
   
 On 08/01/2013 03:08 AM, Jiang BIAN bianji...@google.com wrote:
  Hi,
 
  I noticed some pages we crawled containing error message like
 this;
 
  div id=mw-content-text lang=zh-CN dir=ltr
 class=mw-content-ltrp
  class=errorFailed to render property P373:
  Wikibase\LanguageWithConversion::factory: given languages do not
  have
the
  same parent language/p
 
 
  But when I open the url in browser, there is no such message. And
   using
  index.php can also get normal content without error messages.
 
  Here are examples you can retry:
 
  bad
  $ wget 'http://zh.wikipedia.org/zh-cn/Google'
 
  good
  $ wget 'http://zh.wikipedia.org/w/index.php?title=Google'
 
 
  Looks like something is wrong on Wikipedia side, anything need to
   fix?
 
 
 
  Thanks

 I checked with Jiang Bian and found out that this is still
 happening
  --
 can anyone help Google out here? :-)

 --
 Sumana Harihareswara
 Engineering Community Manager
 Wikimedia Foundation

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

   
There was a bug in some Wikibase version deployed in July which
 caused
   this
error, but a fix was backported soon and since then I've never seen
 any
similar error as a logged in user. If you still see some errors only
  when
unlogged in at particular URLs (like what you described) now, it's
  likely
that those URLs got cached in Squid when the bug was live... In this
  case
purging those pages[1] should be able to fix the issue.
   
[1] https://en.wikipedia.org/wiki/Wikipedia:Purge
   
-Liangent
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
   
  
  
  
   --
   Jiang BIAN
  
   This email may be confidential or privileged.  If you received this
   communication by mistake, please don't forward it to anyone else,
 please
   erase all copies and attachments, and please let me know that it went
 to
   the wrong person.  Thanks.
   ___
   Wikitech-l mailing list
   Wikitech-l@lists.wikimedia.org
   https://lists.wikimedia.org/mailman/listinfo/wikitech-l
  
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 



 --
 Jiang BIAN

 This email may be confidential or privileged.  If you received this
 communication by mistake, please don't forward it to anyone else, please
 erase all copies and attachments, and please let me know that it went to
 the wrong person.  Thanks.
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Weighted random article

2013-08-22 Thread Lars Aronsson

The Swedish Wikipedia now has more than 1.5 million
articles, compared to 600,000 in January 2013 and
500,000 in September 2012. This is due to the creation
by a bot of many articles on animal and plant species.

The Swedish Wikipedia community has discussed the
matter thoroughly, and there is strong consensus to
keep these articles and to keep on generating more.
(It is known that many German wikipedians think these
are bad articles that should be removed, but this is not
their decision.)

The current implementation of [[Special:Random]],
however, gives equal weight to every existing article and
this is perceived as a problem that needs to be fixed.

But it is not obvious how a bug report or feature
request should be written. A naive approach would be
to ask for a random article that wasn't created by a
bot, but this misses the point. Users want bot-generated
articles to come up, just not so often. And
some manually written article stubs are also less wanted.
Perhaps the random function should be weighted by
article length or by the number of page views? But is
it practical to implement such a weighted random
function? Are the necessary data in the database?


--
  Lars Aronsson (l...@aronsson.se)
  Aronsson Datateknik - http://aronsson.se




Re: [Wikitech-l] Weighted random article

2013-08-22 Thread Tim Starling
On 23/08/13 10:48, Lars Aronsson wrote:
 But it is not obvious how a bug report or feature
 request should be written. A naive approach would be
 to ask for a random article that wasn't created by a
 bot, but this is not to the point. 

That was my solution when this issue came up on the English Wikipedia:

http://www.mediawiki.org/wiki/Special:Code/MediaWiki/4256

The configured SQL excluded pages most recently edited by Rambot.
Derek Ramsey was opposed to it, since he thought his US census stubs
deserved eyeballs just as much as any hand-written article, but IIRC I
managed to get this solution deployed, at least for a year or two.

 Users want bot
 generated articles to come up, only not so often. And
 some manually written article stubs are also less wanted.
 Perhaps the random function should be weighted by
 article length or by the number of page views? But is
 it practical to implement such a weighted random
 function? Are the necessary data in the database?

It would not be especially simple. The existing database schema does
not allow weighted random selection. A special data structure could be
used, or it could be implemented (inefficiently) in Lucene.

An approximation would be to select, say, 100 articles from the
database using page_random, then calculate a weight for each of those
100 articles using complex criteria, then do a weighted random
selection from those 100 articles.

Article length is in the database, but page view count is not.
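
For illustration, a minimal Lua sketch of that weighted draw (the sample
entries and weights are invented; in practice they would come from the ~100
page_random rows and whatever weighting criteria are chosen):

    -- Draw one entry with probability proportional to its weight.
    local sample = {
        { title = 'Hand-written article', weight = 10 },
        { title = 'Bot-created stub',     weight = 1 },
        { title = 'Illustrated article',  weight = 20 },
    }

    local total = 0
    for _, page in ipairs( sample ) do
        total = total + page.weight
    end

    local r = math.random() * total
    for _, page in ipairs( sample ) do
        r = r - page.weight
        if r <= 0 then
            print( page.title )  -- the weighted random pick
            break
        end
    end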

-- Tim Starling



Re: [Wikitech-l] Weighted random article

2013-08-22 Thread Lars Aronsson

On 08/23/2013 03:57 AM, Tim Starling wrote:

An approximation would be to select, say, 100 articles from the
database using page_random, then calculate a weight for each of those
100 articles using complex criteria, then do a weighted random
selection from those 100 articles.


Interesting. An even easier/coarser approximation
would be to make a second draw only when the
first draw doesn't meet some criteria (e.g.
bot-created, shorter than L bytes, lacks illustration).
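
For illustration, a sketch of that coarser variant in Lua (isWanted() and the
page fields here are stand-ins for whatever acceptance criteria are chosen):

    -- Redraw once if the first random pick fails the acceptance test,
    -- then accept whatever the second draw returns.
    local function isWanted( page )
        return not page.botCreated and page.length >= 2000
    end

    local function randomPage( pages )
        return pages[ math.random( #pages ) ]
    end

    local function randomPageFiltered( pages )
        local page = randomPage( pages )
        if not isWanted( page ) then
            page = randomPage( pages )
        end
        return page
    end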

On an average day, Special:Random (and its
translation Special:Slumpsida) seems to be
called some 9,000 times on sv.wikipedia.


--
  Lars Aronsson (l...@aronsson.se)
  Aronsson Datateknik - http://aronsson.se



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Weighted random article

2013-08-22 Thread Benjamin Lees
Just add all the non-bot articles to a category and use
Special:RandomInCategory. ;-)