[Wikitech-l] Re: How to get Top 1000 contributors list?

2022-06-25 Thread John
That depends on what you mean by top editors. Is that edit count, articles
created, total bytes of content added, etc.?
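
If edit count is the metric, a rough sketch using the public API could look
like the following (untested; it assumes the Python requests library, and
since list=allusers cannot be sorted by edit count it pages through every
user and sorts locally, which is only practical on a smaller wiki such as
tawiki):

# Rough sketch (untested): rank ta.wikipedia.org users by edit count via the
# public API, paging through list=allusers and sorting client-side.
import requests

API = "https://ta.wikipedia.org/w/api.php"

def top_contributors(n=1000):
    users = []
    params = {
        "action": "query", "format": "json", "list": "allusers",
        "auprop": "editcount", "aulimit": "500",
    }
    while True:
        data = requests.get(API, params=params).json()
        users.extend(data["query"]["allusers"])
        if "continue" not in data:
            break
        params.update(data["continue"])
    users.sort(key=lambda u: u.get("editcount", 0), reverse=True)
    return users[:n]

for u in top_contributors():
    print(u["name"], u.get("editcount", 0))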

On Sat, Jun 25, 2022 at 12:20 PM Roy Smith  wrote:

> This should do it:
>
> https://quarry.wmcloud.org/query/65641
>
> In general, this is an inefficient query because user_editcount isn't
> indexed.  Fortunately, tawiki has a relatively small number of users, so it
> works.  Running the same query on a much larger wiki, say en, would
> probably time out.
>
>
> On Jun 25, 2022, at 12:07 PM, Shrinivasan T 
> wrote:
>
> Hello all,
>
> I would like to get the top 1000 contributors for ta.wikipedia.org based
> on their user contribution metric.
>
> Is there any code or API for this?
>
> Please share.
>
> Thanks.
>
> --
> Regards,
> T.Shrinivasan
>
>
> My Life with GNU/Linux : http://goinggnu.wordpress.com
> Free E-Magazine on Free Open Source Software in Tamil : http://kaniyam.com
>
> Get Free Tamil Ebooks for Android, iOS, Kindle, Computer :
> http://FreeTamilEbooks.com 
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/

[Wikitech-l] Re: Feature idea: data structure to improve translation capabilities

2021-06-20 Thread John
Offhand, isn't this something that Wikidata was set up to handle?

On Sun, Jun 20, 2021 at 12:40 PM Wolter HV  wrote:

> Hello,
>
> I have been thinking of a way to organise data in Wiktionary that would
> allow
> for words to automatically show translations to other languages with much
> less
> work than is currently required.
>
> Currently, translations to other languages have to be added manually,
> meaning
> they are not automatically propagated across language pairs.  What I mean
> by
> this is showcased in the following example:
>
>  1. I create a page for word X in language A.
>  2. I create a page for word Y in language B.
>  3. I add a translation to the page for word X, and state that it
> translates to
> word Y in language B.
>  4. If I want the page for word Y to show that it translates to word X in
> language A, I have to do this manually.
>
> Automating this seems a bit tricky.  I think that the key is acknowledging
> that
> meanings can be separated from language and used as the links of
> translation.
> In this view, words and their definitions are language-specific, but
> meanings
> are language-agnostic.
>
> Because I may have done a bad job at explaining this context, I have
> created a
> short example in the form of an sqlite3 SQL script that creates a small
> dictionary database with two meanings for the word "desert"; one of the
> meanings has been linked to the corresponding words in Spanish and in
> German.
> The script mainly showcases how words can be linked across languages with
> minimal rework.
>
> You can find the script attached.  To experiment with this, simply run
>
> .read feature_showcase.sql
>
> within an interactive sqlite3 session.  (There may be other ways of doing
> it
> but this is how I tested it.)
>
> I believe this system can also be used to automate other word relations
> such as
> hyponyms and hypernyms, meronyms and holonyms, and others.  It can also
> allow
> looking up words in other languages and getting definitions in the
> language of
> choice.  In short, it would allow Wiktionary to more effortlessly function
> as
> a universal dictionary.
>
> Has something like this been suggested before?  I would be pleased to
> receive
> feedback on this idea.
>
> With kind regards,
> Wolter HV
> ___
> Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
> To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
>
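
A minimal sketch of the meaning-as-hub idea described above (this is not the
attached script, which is not reproduced here; the table and column names
are invented for illustration):

# Sketch only: words are language-specific, meanings are language-agnostic,
# and translations fall out of the word->meaning links with no per-pair
# bookkeeping.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE language (id INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE meaning  (id INTEGER PRIMARY KEY, gloss TEXT);
CREATE TABLE word     (id INTEGER PRIMARY KEY, language_id INTEGER, spelling TEXT);
CREATE TABLE word_meaning (word_id INTEGER, meaning_id INTEGER);

INSERT INTO language VALUES (1, 'en'), (2, 'es'), (3, 'de');
INSERT INTO meaning  VALUES (1, 'arid, sandy region');
INSERT INTO word     VALUES (1, 1, 'desert'), (2, 2, 'desierto'), (3, 3, 'Wüste');
INSERT INTO word_meaning VALUES (1, 1), (2, 1), (3, 1);
""")

# All translations of "desert", in every language, from a single query.
for code, spelling in con.execute("""
    SELECT l.code, w2.spelling
    FROM word w1
    JOIN word_meaning wm1 ON wm1.word_id = w1.id
    JOIN word_meaning wm2 ON wm2.meaning_id = wm1.meaning_id
    JOIN word w2 ON w2.id = wm2.word_id
    JOIN language l ON l.id = w2.language_id
    WHERE w1.spelling = 'desert' AND w2.id != w1.id
"""):
    print(code, spelling)
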
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/

Re: [Wikitech-l] How to get a list of usercontribs for a given date range?

2020-11-18 Thread John
That's not how those parameters work. You use either one or the other, not
both. You are either going forward or backward through the contribs based on
which one you use. You need to apply some logic on the application layer to
filter the results down to what you need.
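
A sketch of that application-layer filtering (untested; it assumes the Python
requests library; the API enumerates contributions newest-to-oldest by
default, and the user name and timestamps are just the ones from the
question below):

# Sketch: fetch a user's contributions and keep only those inside a date
# range, filtering on the application layer.
import requests

API = "https://ta.wikisource.org/w/api.php"

def contribs_between(user, start, end):
    params = {
        "action": "query", "format": "json", "list": "usercontribs",
        "ucuser": user, "uclimit": "500", "ucprop": "ids|title|timestamp",
    }
    hits = []
    while True:
        data = requests.get(API, params=params).json()
        for c in data["query"]["usercontribs"]:
            if start <= c["timestamp"] <= end:  # ISO timestamps compare lexically
                hits.append(c)
        if "continue" not in data:
            break
        params.update(data["continue"])
    return hits

print(len(contribs_between("Fathima Shaila",
                           "2020-11-14T00:00:00Z", "2020-11-17T23:59:59Z")))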

On Wed, Nov 18, 2020 at 6:50 AM Shrinivasan T 
wrote:

> Hello all,
>
> I want to get a usercontrib details of a wikisource user in given date
> range.
>
> Trying this
>
>
> https://ta.wikisource.org/wiki/Special:ApiSandbox#action=query=json=usercontribs=2020-11-14T19%3A34%3A26.000Z=2020-11-17T19%3A34%3A26.000Z=Fathima
> Shaila
>
> If I give only ucstart, it works fine.
> Adding ucend gives 0 results.
>
> Need help on solving this.
>
> Share your thoughts on how to solve this.
>
> Thanks.
>
> --
> Regards,
> T.Shrinivasan
>
>
> My Life with GNU/Linux : http://goinggnu.wordpress.com
> Free E-Magazine on Free Open Source Software in Tamil : http://kaniyam.com
>
> Get Free Tamil Ebooks for Android, iOS, Kindle, Computer :
> http://FreeTamilEbooks.com
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Making breaking changes without deprecation?

2020-09-01 Thread John
Honestly, if you want a deprecation policy, warnings need to be emitted for
at least one 1.x version. Anything less than that is pointless from an
end-user perspective. We tend to wait for final releases to limit bug
exposure. If something breaks, and it's not clear exactly what the cause is,
using incremental updates to figure out the breakage is the solution normally
applied. One of the key reasons to use a deprecation policy is to limit the
breakage that can happen; if the MediaWiki culture is shifting to a "screw
non-published extensions" policy, we might as well not have a deprecation
policy. However, if the historical spirit of MW is maintained, having such a
policy is critical. MediaWiki is reused by a lot of different groups, and not
all of them are able or willing to publish their extension code, for a number
of reasons. Taking the "not my problem" approach leaves a sour taste in my
mouth. Honestly, if you cannot maintain compatibility for at least one
release cycle, how much damage are you going to create?

On Tue, Sep 1, 2020 at 8:58 AM Daniel Kinzler 
wrote:

> Hi Arthur!
>
> We were indeed thinking of different scenarios. I was thinking of someone
> who
> runs a wiki with a couple of one-off private extensions running, and now
> wants
> to update. They may well test that everything is still working with the new
> version of MediaWiki, but I think they would be unlikely to test with
> development settings enabled. The upgrade guide doesn't mention this, and
> even
> if it did, I doubt many people would remember to enable it. So they won't
> notice
> deprecations until the code is removed.
>
> I understand your scenario to refer to an extension developer explicitly
> testing
> whether their extension is working with the next release, and trying it in
> their
> development environment. They would see the deprecation warnings, and
> address
> them. But in that scenario, would it be so much worse to see fatal errors
> instead of deprecation warnings?
>
> This is not meant to be a loaded question. I'm trying to understand what
> the
> practical consequences would be. Fatal errors are of course less nice, but
> in a
> testing environment, not a real problem, right? I suppose deprecation
> warnings
> can provide better information than fatal errors would, but one can also
> find
> this information in the release notes, once it is clear what to look for.
>
> Also note that this would only affect private extensions. Public extensions
> would receive support up front, and early removal of the obsolete code
> would be
> blocked until all known extensions are fixed.
>
> Thank you for your thoughts!
> -- daniel
>
>
> Am 31.08.20 um 20:54 schrieb Arthur Smith:
> > Hmm, maybe we're talking past one another here? I'm assuming a developer
> of an
> > extension who is interested in testing a new release - if we have a
> version that
> > has things deprecated vs completely removed, that allows a quick check
> to see if
> > the deprecated code affects them without going back into their own code
> (which
> > may have been developed partly by somebody else so  just reading release
> notes
> > wouldn't clue them in that there might be a problem).
>
> --
> Daniel Kinzler
> Principal Software Engineer, Core Platform
> Wikimedia Foundation
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Ethical question regarding some code

2020-08-08 Thread John Erling Blad
Please stop calling this an “AI” system; it is not. It is statistical
learning.

This is probably not going to make me popular…

In some jurisdictions you will need a permit to create, manage, and store
biometric identifiers, no matter whether the biometric identifier is for a
known person or not. If you want to create biometric identifiers, and use
them, make darn sure you follow every applicable law and rule. I'm not
amused by the idea of having CUs using illegal tools to vet ordinary users.

Any system that tries to remove the anonymity of users on Wikipedia should
have an RfC where the community can make their concerns heard. This is not
the proper forum to get acceptance from Wikipedia's community.

And by the way, systems for cleanup of prose exist for a whole bunch of
languages, not only English. Grammarly is one, LanguageTool another, and
there are a whole bunch of other such tools.

lør. 8. aug. 2020, 19.42 skrev Amir Sarabadani :

> Thank you all for the responses, I try to summarize my responses here.
>
> * By closed source, I don't mean it will be only accessible to me, It's
> already accessible by another CU and one WMF staff, and I would gladly
> share the code with anyone who has signed NDA and they are of course more
> than welcome to change it. Github has a really low limit for people who can
> access a private repo but I would be fine with any means to fix this.
>
> * I have read that people say that there are already public tools to
> analyze text. I disagree, 1- The tools you mentioned are for English and
> not other languages (maybe I missed something) and even if we imagine there
> would be such tools for big languages like German and/or French, they don't
> cover lots of languages unlike my tool that's basically language agnostic
> and depends on the volume of discussions happened in the wiki.
>
> * I also disagree that it's not hard to build. I have lots of experience
> with NLP (with my favorite work being a tool that finds swear words in
> every language based on history of vandalism in that Wikipedia [1]) and
> still it took me more than a year (a couple of hours almost in every
> weekend) to build this, analyzing a pure clean text is not hard, cleaning
> up wikitext and templates and links to get only text people "spoke" is
> doubly hard, analyzing user signatures brings only suffer and sorrow.
>
> * While in general I agree if a government wants to build this, they can
> but reality is more complicated and this situation is similar to security.
> You can never be 100% secure but you can increase the cost of hacking you
> so much that it would be pointless for a major actor to do it. Governments
> have a limited budget, and dictatorships are by design corrupt and filled
> with incompetent people [2], and sanctions put another restraint on such
> governments too, so I would not hand them such an opportunity for oppression
> on a silver plate for free. If they really want to, then they must pay for it
> (which means they can't use that money/those resources on oppressing some
> other groups).
>
> * People have said this AI is easy to be gamed, while it's not that easy
> and the tools you mentioned are limited to English, it's still a big win
> for the integrity of our projects. It boils down again to increasing the
> cost. If a major actor wants to spread disinformation, so far they only
> need to fake their UA and IP which is a piece of cake and I already see
> that (as a CU) but now they have to mess with UA/IP AND change their
> methods of speaking (which is one order of magnitude harder than changing
> IP). As I said, increasing this cost might not prevent it from happening
> but at least it takes away the ability of oppressing other groups.
>
> * This tool never will be the only reason to block a sock. It's more than
> anything a helper, if CU brings a large range and they are similar but the
> result is not conclusive, this tool can help. Or when we are 90% sure it's
> a WP:DUCK, this tool can help too but blocking just because this tool said
> so would imply a "Minority Report" situation and, to be honest, I would
> really like to avoid that. It is supposed to empower CUs.
>
> * Banning using this tool is not possible legally, the content of Wikipedia
> is published under CC-BY-SA and this allows such analysis, especially since
> you can't ban an off-wiki action. Also, if a university professor can do it, I
> don't see the point of banning using it by the most trusted group of users
> (CUs). You can ban blocking based on this tool but I don't think we should
> block solely based on this anyway.
>
> * It has been pointed out by people in the checkuser mailing list that
> there's no point in logging accessing this tool, since the code is
> accessible to CUs (if they want to), so they can download and run it on
> their computer without logging anyway.
>
> * There is a huge difference between CU and this AI tool in matters of
> privacy. While both are privacy sensitive but CU reveals much more, as a
> CU, I know 

Re: [Wikitech-l] Ethical question regarding some code

2020-08-06 Thread John Erling Blad
For those interested: the best solution I know of for this kind of
similarity detection is a Siamese network with RNNs in the first part.
That implies you must extract fingerprints for all likely candidates
(users), and then some, to create a baseline. You cannot simply claim that
two users (adversary and postulated sock) are the same because they have
edited the same page. It is quite unlikely a user will edit the same page
with a sock puppet when it is known that such a system is in use.
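
As a rough illustration of that kind of Siamese set-up (a sketch only, not
the tool under discussion; it assumes PyTorch and dummy token data, and it
omits feature extraction and the training step, e.g. a contrastive loss on
known same/different pairs):

# Toy Siamese encoder: the same RNN encodes two token sequences, and the
# cosine similarity of the two embeddings serves as a "same author?" score.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):               # (batch, seq_len) int tensor
        _, h = self.rnn(self.embed(token_ids))  # h: (1, batch, hidden_dim)
        return h.squeeze(0)                     # (batch, hidden_dim)

encoder = Encoder(vocab_size=10000)
a = torch.randint(0, 10000, (1, 50))  # tokenized text of user A (dummy data)
b = torch.randint(0, 10000, (1, 50))  # tokenized text of user B (dummy data)
print(F.cosine_similarity(encoder(a), encoder(b)).item())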

On Thu, Aug 6, 2020 at 10:49 PM John Erling Blad  wrote:

> Nice idea! First time I wrote about this being possible was back in
> 2008-ish.
>
> The problem is quite trivial, you use some observable feature to
> fingerprint an adversary. The adversary can then game the system if the
> observable feature can be somehow changed or modified. To avoid this the
> observable features are usually chosen to be physical properties that can't
> be easily changed.
>
> In this case the features are word and/or relations between words, and
> then the question is “Can the adversary change the choice of words?” Yes he
> can, because the choice of words is not an inherent physical property of
> the user. In fact there are several programs that help users express
> themselves in a more fluent way, and such systems will change the
> observable features i.e. choice of words. The program will move the
> observable features (the words) from one user-specific distribution to
> another more program-specific distribution. You will observe the users a
> priori to be different, but with the program they will be a posteriori more
> similar.
>
> A real problem is your own poisoning of the training data. That happens
> when you find some subject to be the same as your postulated one, and then
> feed the information back into your training data. If you don't do that
> your training data will start to rot because humans change over time. It is
> bad anyway you do it.
>
> Even more fun is an adversary that knows what you are doing, and tries to
> negate your detection algorithm, or even fool you into believing he is
> someone else. It is after all nothing more than word count and statistics.
> What will you do when someone edits a Wikipedia-page and your system tells
> you “This revision is most likely written by Jimbo”?
>
> Several such programs exist, and I'm a bit perplexed that they are not in
> more use among Wikipedia's editors. Some of them are more aggressive, and
> can propose quite radical rewrites of the text. I use one of them, and it
> is not the best, but still it corrects me all the time.
>
> I believe it would be better to create a system where users are internally
> identified and externally authenticated. (The previous is biometric
> identification, and must adhere to privacy laws.)
>
> On Thu, Aug 6, 2020 at 4:33 AM Amir Sarabadani 
> wrote:
>
>> Hey,
>> I have an ethical question that I couldn't answer yet and have been asking
>> around but no definite answer yet so I'm asking it in a larger audience in
>> hope of a solution.
>>
>> For almost a year now, I have been developing an NLP-based AI system to be
>> able to catch sock puppets (two users pretending to be different but
>> actually the same person). It's based on the way they speak. The way we
>> speak is like a fingerprint and it's unique to us and it's really hard to
>> forge or change on demand (unlike IP/UA), as the result if you apply some
>> basic techniques in AI on Wikipedia discussions (which can be really
>> lengthy, trust me), the datasets and sock puppets shine.
>>
>> Here's an example, I highly recommend looking at these graphs, I compared
>> two pairs of users, one pair that are not sock puppets and the other is a
>> pair of known socks (a user who got banned indefinitely but came back
>> hidden under another username). [1][2] These graphs are based on one of
>> several aspects of this AI system.
>>
>> I have talked about this with WMF and other CUs to build and help us
>> understand and catch socks. Especially the ones that have enough resources
>> to change their IP/UA regularly (like sock farms, and/or UPEs) and also
>> with the increase of mobile internet providers and the horrible way they
>> assign IP to their users, this can get really handy in some SPI ("Sock
>> puppet investigation") [3] cases.
>>
>> The problem is that this tool, while being built only on public
>> information, actually has the power to expose legitimate sock puppets.
>> People who live under oppressive governments and edit on sensitive topics.
>> Disclosing such connections between two accounts can cost people their
>> lives.
>>
>> So, this cod

Re: [Wikitech-l] Ethical question regarding some code

2020-08-06 Thread John Erling Blad
Nice idea! First time I wrote about this being possible was back in
2008-ish.

The problem is quite trivial, you use some observable feature to
fingerprint an adversary. The adversary can then game the system if the
observable feature can be somehow changed or modified. To avoid this the
observable features are usually chosen to be physical properties that can't
be easily changed.

In this case the features are words and/or relations between words, and then
the question is “Can the adversary change the choice of words?” Yes he can,
because the choice of words is not an inherent physical property of the
user. In fact there are several programs that help users express themselves
in a more fluent way, and such systems will change the observable features
i.e. choice of words. The program will move the observable features (the
words) from one user-specific distribution to another more program-specific
distribution. You will observe the users a priori to be different, but with
the program they will be a posteriori more similar.

A real problem is your own poisoning of the training data. That happens
when you find some subject to be the same as your postulated one, and then
feed the information back into your training data. If you don't do that
your training data will start to rot, because humans change over time. It is
bad any way you do it.

Even more fun is an adversary that knows what you are doing, and tries to
negate your detection algorithm, or even fool you into believing he is
someone else. It is after all nothing more than word count and statistics.
What will you do when someone edits a Wikipedia-page and your system tells
you “This revision is most likely written by Jimbo”?

Several such programs exist, and I'm a bit perplexed that they are not in
more use among Wikipedia's editors. Some of them are more aggressive, and
can propose quite radical rewrites of the text. I use one of them, and it
is not the best, but still it corrects me all the time.

I believe it would be better to create a system where users are internally
identified and externally authenticated. (The previous is biometric
identification, and must adhere to privacy laws.)

On Thu, Aug 6, 2020 at 4:33 AM Amir Sarabadani  wrote:

> Hey,
> I have an ethical question that I couldn't answer yet and have been asking
> around but no definite answer yet so I'm asking it in a larger audience in
> hope of a solution.
>
> For almost a year now, I have been developing an NLP-based AI system to be
> able to catch sock puppets (two users pretending to be different but
> actually the same person). It's based on the way they speak. The way we
> speak is like a fingerprint and it's unique to us and it's really hard to
> forge or change on demand (unlike IP/UA), as the result if you apply some
> basic techniques in AI on Wikipedia discussions (which can be really
> lengthy, trust me), the datasets and sock puppets shine.
>
> Here's an example, I highly recommend looking at these graphs, I compared
> two pairs of users, one pair that are not sock puppets and the other is a
> pair of known socks (a user who got banned indefinitely but came back
> hidden under another username). [1][2] These graphs are based on one of
> several aspects of this AI system.
>
> I have talked about this with WMF and other CUs to build and help us
> understand and catch socks. Especially the ones that have enough resources
> to change their IP/UA regularly (like sock farms, and/or UPEs) and also
> with the increase of mobile internet providers and the horrible way they
> assign IP to their users, this can get really handy in some SPI ("Sock
> puppet investigation") [3] cases.
>
> The problem is that this tool, while being built only on public
> information, actually has the power to expose legitimate sock puppets.
> People who live under oppressive governments and edit on sensitive topics.
> Disclosing such connections between two accounts can cost people their
> lives.
>
> So, this code is not going to be public, period. But we need to have this
> code in Wikimedia Cloud Services so people like CUs in other wikis be able
> to use it as a web-based tool instead of me running it for them upon
> request. But WMCS terms of use explicitly say code should never be
> closed-source and this is our principle. What should we do? I pay a
> corporate cloud provider for this and put such important code and data
> there? We amend the terms of use to have some exceptions like this one?
>
> The most plausible solution suggested so far (thanks Huji) is to have a
> shell of a code that would be useless without data, and keep the code that
> produces the data (out of dumps) closed (which is fine, running that code
> is not too hard even on enwiki) and update the data myself. This might be
> doable (which I'm around 30% sure, it still might expose too much) but it
> wouldn't cover future cases similar to mine and I think a more long-term
> solution is needed here. Also, it would reduce the bus factor to 

[Wikitech-l] T65602

2020-05-03 Thread John
This ticket has been sitting around for just over 6 years now; any chance
it can get some love? Getting this to use an existing generator, such as
action=parse&page=Sandbox&prop=externallinks, hopefully shouldn't be too
difficult.
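
For reference, a sketch of the existing call mentioned above (illustrative
only; en.wikipedia.org and the Sandbox page are just examples, and it
assumes the Python requests library):

# Illustrative: action=parse already returns the external links of a
# rendered page.
import requests

resp = requests.get("https://en.wikipedia.org/w/api.php", params={
    "action": "parse", "format": "json",
    "page": "Sandbox", "prop": "externallinks",
}).json()
print(resp["parse"]["externallinks"])
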
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Render with a slow process

2020-04-25 Thread John Erling Blad
Slow process, fast rendering

Imagine someone edits a page, and that editing hits a very slow
tag-function of some kind. You want to respond fast with something
readable, some kind of temporary page until the slow process has finished.
Do you choose to reuse what you had from the last revision, if the
content of the function hasn't changed, or do you respond with a note that
you are still processing? The temporary page could then update missing
content through an API.

I assume that a plain rerender of the page will occur after the actual
content is updated and available, i.e. a rerender will only happen some
time after the edit and the slow process would then be done. A last step of
the process could then be to purge the temporary page.

This works almost everywhere, but not for collections; they don't know about
temporary pages. Is there some mechanism implemented that does this or
something similar? It feels like something that should have a general
solution.

There are at least two different use cases; one where some existing
information gets augmented with external data (without really changing the
content), and one where some external data (could be content) gets added to
existing content. An example of the first could be verification of
references, while the latter could be natural language generation.

/jeblad
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Security Team collaboration

2020-01-28 Thread John Bennett
Hello,

In an effort to create a repeatable and streamlined process for consumption
of security services the Security Team has been working on changes and
improvements to our workflows. Much of this effort is an attempt to
consolidate work intake for our team in order to more effectively
communicate status, priority and scheduling.  This is step 1 and we expect
future changes as our tooling, capabilities and processes mature.

*How to collaborate with the Security Team*

The Security Team works in an iterative manner to build new and mature
existing security services as we face new threats and identify new risks.
For a list of currently deployed services please review our services [1]
page.

The initial point of contact for the majority of our services is now a
consistent Request For Services [2] (RFS) form [3].

The two workflow exceptions to RFS are the Privacy Engineering [4] service
and Security Readiness Review [5] process which already had established
methods that are working well.

If the RFS forms are confusing or don't lead you to the answers you need,
try security-h...@wikimedia.org to get assistance with finding the right
service, process, or person.

secur...@wikimedia.org will continue to be our primary external reporting
channel.

*Coming changes in Phabricator*

We will be disabling the workboard on the #Privacy [6] project.  This
workboard is not actively or consistently cultivated and often confuses
those who interact with it.  #Privacy is a legitimate tag to be used in
many cases, but the resourced privacy contingent within the Security Team
will be using the #privacy engineering [7] component.

We will be disabling the workboard for the #Security [8] project.  Like the
#Privacy project this workboard is not actively or consistently cultivated
and is confusing.  Tasks which are actively resourced should have an
associated group [9] tag such as #Security Team [10].

The #Security project will be broken up into subprojects [11] with
meaningful names that indicate user relation to the #Security landscape.
This is in service to #Security no longer serving double duty as an ACL and
a group project.  An ACL*Security-Issues project will be created and
#Security will still be available to link cross cutting issues, but will
also allow equal footing for membership for all Phabricator users.

*Other Changes*

A quick callout to the consistency [12] and Gerrit sections of our team
handbook [13].  As a team we have agreed that all changesets we interact on
need a linked task with the #security-team tag.

security@ will soon be managed as a Google group collaborative inbox [14]
as outlined in T243446.

Thanks
John

[1] Security Services
https://www.mediawiki.org/wiki/Wikimedia_Security_Team/Services
[2] Security RFS docs
https://www.mediawiki.org/wiki/Security/SOP/Requests_For_Service
[3] RFS form
https://phabricator.wikimedia.org/maniphest/task/edit/form/72/
[4] Privacy Engineering RFS
https://form.asana.com/?hash=554c8a8dbf8e96b2612c15eba479287f9ecce3cbaa09e235243e691339ac8fa4=1143023741172306
[5] Readiness Review SOP
https://www.mediawiki.org/wiki/Security/SOP/Security_Readiness_Reviews
[6] Phab Privacy tag
https://phabricator.wikimedia.org/tag/privacy/
[7] Privacy Engineering Project
https://phabricator.wikimedia.org/project/view/4425/
[8] Security Tag
https://phabricator.wikimedia.org/tag/security/
[9] Phab Project types
https://www.mediawiki.org/wiki/Phabricator/Project_management#Types_of_Projects
[10] Security Team tag
https://phabricator.wikimedia.org/tag/security-team/
[11] Security Sub Projects
https://phabricator.wikimedia.org/project/subprojects/4420/
[12] Security Team Handbook
https://www.mediawiki.org/wiki/Wikimedia_Security_Team/Handbook#Consistency
[13] Secteam handbook-gerrit
https://www.mediawiki.org/wiki/Wikimedia_Security_Team/Handbook#Gerrit
[14] Google collab inbox
https://support.google.com/a/answer/167430?hl=en
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] New year / Nova godina

2019-12-31 Thread John Shepherd
Wishing the same to you! :)

On 2019-12-31 at 3:31 PM, Zoran Dori wrote:

> Happy new year to everyone!! / Srećna Nova godina svima!!
> 
> Best wishes! / Sve najbolje!
> 
> Zoran.
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] URL parameter fetching (mediawiki)

2019-10-01 Thread John Shepherd
Is there a “special” way in MediaWiki to get parameters passed by URL, or do you
just use the conventional PHP way ($_GET)? I am trying to check if
“Special:CreateAccount?reason=” is set/has a value (and what that value is).

Thank you for your assistance, trying to be convention compliant.
-TheSandDoctor

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] 503 Backend fetch failed

2019-10-01 Thread John
I’ve seen it a number of times; often just redoing the request makes it go
away.
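
For example, a minimal retry along those lines (a sketch; it assumes the
Python requests library, and most API client frameworks already do
something similar):

# Sketch: retry a request a few times with a short backoff when the backend
# answers 503, since the error is usually transient.
import time
import requests

def get_with_retry(url, retries=3, delay=5, **kwargs):
    for attempt in range(retries):
        resp = requests.get(url, **kwargs)
        if resp.status_code != 503:
            return resp
        time.sleep(delay * (attempt + 1))
    return resp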

On Tue, Oct 1, 2019 at 9:06 AM Bináris  wrote:

> Hi folks,
> I got this message while running a bot:
> Result: 503 Backend fetch failed
>
> I know HTTP 503 message, but I never got it during botwork. I restarted it
> successfully. Is this normal?
>
> --
> Bináris
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Neural nets, and Lua?

2019-09-26 Thread John Erling Blad
Forgot to say why this is important. Neural nets, especially recurrent
neural nets (RNN),
can do inflection and thus make reuse of Wikidata statements possible
inside the text.
A lot of languages have quite complex rules for inflection and agreement.

An alternative to RNN is finite state transfer (FSM).

On Thu, Sep 26, 2019 at 3:03 PM John Erling Blad  wrote:

> A project that could be really interesting is to make a Lua interface for
> some of the new neural nets, especially based on the Tsetlin-engine. Sounds
> nifty, but it is nothing more than a slight reformulation of an old
> learning algorithm (type early 70th), where the old algorithm has problem
> converging for bad training data (ie. not separable). What is really nice
> is that a trained network is extremely efficient, as it is mostly just
> bit-operations or add-operations. Which means we can make rather fancy
> classifiers that run in the web servers, and thus without any delayed
> update of the pages.
>
> The bad thing is that the training must be done offline, because that is
> nowhere near lightweight.
>
> Ordinary classifiers seems to work well, that is equivalents to fully
> connected layers. Also some types of convolutional layers. Some regressions
> can be done, but the networks are binary in nature, and mapping to and from
> linear scaling adds complexity.
>
> But running neural nets inside a PHP-based web server… I doubt we would
> hit the 10 sec limit for a Lua module even if we added several such
> networks.
>
> Ok, to much coffee today…
>
> John
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Neural nets, and Lua?

2019-09-26 Thread John Erling Blad
A project that could be really interesting is to make a Lua interface for
some of the new neural nets, especially based on the Tsetlin-engine. Sounds
nifty, but it is nothing more than a slight reformulation of an old
learning algorithm (of a type from the early 1970s), where the old algorithm
has problems converging for bad training data (i.e. not separable). What is
really nice
is that a trained network is extremely efficient, as it is mostly just
bit-operations or add-operations. Which means we can make rather fancy
classifiers that run in the web servers, and thus without any delayed
update of the pages.
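
To make the "mostly bit-operations" point concrete, here is a toy sketch of
Tsetlin-machine-style inference over already-trained clauses (illustrative
only; the clause masks are made up, and the training step, which is the
heavy part, is not shown):

# Each trained clause is a pair of bitmasks over the binarized input:
# literals that must be 1 and literals that must be 0. Evaluating a clause
# is a couple of ANDs and comparisons; the class score is a sum of votes.
def clause_fires(x_bits, must_be_one, must_be_zero):
    return (x_bits & must_be_one) == must_be_one and (x_bits & must_be_zero) == 0

def class_score(x_bits, positive_clauses, negative_clauses):
    score = sum(clause_fires(x_bits, m1, m0) for m1, m0 in positive_clauses)
    score -= sum(clause_fires(x_bits, m1, m0) for m1, m0 in negative_clauses)
    return score

x = 0b1010                      # four binary features packed into an int
positive = [(0b1000, 0b0001)]   # fires: bit 3 set, bit 0 clear
negative = [(0b0100, 0b0000)]   # does not fire: bit 2 is not set
print(class_score(x, positive, negative))   # 1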

The bad thing is that the training must be done offline, because that is
nowhere near lightweight.

Ordinary classifiers seem to work well, that is, equivalents to fully
connected layers. Also some types of convolutional layers. Some regressions
can be done, but the networks are binary in nature, and mapping to and from
linear scaling adds complexity.

But running neural nets inside a PHP-based web server… I doubt we would hit
the 10 sec limit for a Lua module even if we added several such networks.

OK, too much coffee today…

John
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] LoginSignupSpecialPage.php form won’t let me add fields

2019-09-21 Thread John Shepherd
Forgot to add that this is a brand new install without any extensions.

On 2019-09-21 at 10:00 AM, John Shepherd wrote:

> Hello all,
> 
> On my own mediawiki install, I am trying to add another checkbox field to the 
> Special:CreateAccount page. I have found the code responsible for the form, 
> but for some reason the checkbox does not show up. As a test, I then went and 
> tried copying and pasting one of the existing text boxes (with its IDs etc 
> changed of course) to see if that would work. Nothing shows up other than the 
> fields already present.
> 
> Does anyone have any ideas what could be blocking it and/or what I am 
> missing? Below is the diff of the change that doesn’t show.
> 
> https://github.com/TheSandDoctor/misc-code-bits/commit/4f2f6221c64095777622219c6c04c174eb197597
> 
> Thanks!
> TheSandDoctor
> 
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] LoginSignupSpecialPage.php form won’t let me add fields

2019-09-21 Thread John Shepherd
Hello all,

On my own mediawiki install, I am trying to add another checkbox field to the 
Special:CreateAccount page. I have found the code responsible for the form, but 
for some reason the checkbox does not show up. As a test, I then went and tried 
copying and pasting one of the existing text boxes (with its IDs etc changed of 
course) to see if that would work. Nothing shows up other than the fields 
already present.

Does anyone have any ideas what could be blocking it and/or what I am missing? 
Below is the diff of the change that doesn’t show.

https://github.com/TheSandDoctor/misc-code-bits/commit/4f2f6221c64095777622219c6c04c174eb197597

Thanks!
TheSandDoctor

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] content_models table only contains wikitext content model on fresh MW 1.32.3 install

2019-09-19 Thread John
Why don’t you use the existing import/export tools?

On Thu, Sep 19, 2019 at 8:18 AM Tom Schulze <
t.schu...@energypedia-consult.com> wrote:

>
> > There have been reports of similar problems with the slots table. Please
> add
> > your experience to the ticket here:
> >
> > https://phabricator.wikimedia.org/T224949
> >
> > There is a patch up that should safeguard against my best guess at the
> cause of
> > this. If you can provide additional insights as to exactly how this may
> happen,
> > please do!
> >
> >
> Thank you for your quick reply and for pointing me to the right
> direction. I am not sure if it's a mistake on my side, otherwise I'll
> gladly contribute.
>
> I assume that the content_model id is lost/not generated somewhere
> between my clean MW install and the import of my templates via a script.
> I import pages using a custom maintenance script which reads a files'
> content from the file system and saves it to the mediawiki db using:
>
> $title = Title::newFromText('Widget:MyWidget');
> $wikiPage = new WikiPage( $title );
> $newContent = ContentHandler::makeContent( $contentFromFile, $title );
> $wikiPage->doEditContent( $newContent );
>
> In the MW Class reference
> <
> https://doc.wikimedia.org/mediawiki-core/master/php/classContentHandler.html#a2f403e52fb305523b0812f37de41622d
> >
> it says  "If [the modelId parameter for ContentHandler::makeContent()
> is] not provided, $title->getContentModel() is used." I assume, that it
> checks the namespace among others and uses javascript for Widgets?
> Because in my case it's a widget that causes the error. The extension is
> installed prior to the importation and the namespace 'Widget' exists.
>
> Is there something wrong with the snippet?
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Cryptographic puzzles as mitigation for DDoS

2019-09-07 Thread John Erling Blad
Cryptographic puzzles are used to slow down an attack by stopping the
attacker from flooding the servers. They will not stop him from flooding the
network, but usually that is a rather hard task if he cannot establish a
connection with the servers.

On Sat, Sep 7, 2019 at 1:57 PM Alex Monk  wrote:

> I was under the (possibly mistaken) impression that the attacker was just
> flooding the network with traffic?
>
> On Sat, 7 Sep 2019, 12:25 John Erling Blad,  wrote:
>
> > There are several papers about how to stop DDoS by using cryptographic
> > puzzles.[1] The core idea is to give the abuser some algorithmic work he
> > has to solve, thereby forcing him to waste processing power, and then to
> > slow him down to a manageable level.[2] That only work if you are the
> > target, and not some intermediary are targeted.
> >
> > Could it be a solution for the WMF servers?
> >
> > [1] http://d-scholarship.pitt.edu/24944/1/mehmud_abliz_dissertation.pdf
> > (just a random pick)
> > [2]
> >
> >
> https://searchsecurity.techtarget.com/answer/TLS-protocol-Can-a-client-puzzle-improve-security
> > (about
> > TLS, but can also be done at the application level)
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Cryptographic puzzles as mitigation for DDoS

2019-09-07 Thread John Erling Blad
There are several papers about how to stop DDoS by using cryptographic
puzzles.[1] The core idea is to give the abuser some algorithmic work he
has to solve, thereby forcing him to waste processing power, and then to
slow him down to a manageable level.[2] That only works if you are the
target, and not when some intermediary is targeted.
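
As a toy illustration of such a puzzle (a hashcash-style sketch, not a
proposal for the actual WMF setup):

# The server hands out a nonce and a difficulty; the client must find a
# counter whose hash has `difficulty` leading zero hex digits before the
# server processes the request. Solving is costly, verification is one hash.
import hashlib, os

def make_challenge(difficulty=4):
    return os.urandom(8).hex(), difficulty

def verify(nonce, difficulty, counter):
    digest = hashlib.sha256(f"{nonce}:{counter}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

def solve(nonce, difficulty):
    counter = 0
    while not verify(nonce, difficulty, counter):
        counter += 1
    return counter

nonce, difficulty = make_challenge()
answer = solve(nonce, difficulty)          # costly for the client...
print(verify(nonce, difficulty, answer))   # ...cheap for the server: True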

Could it be a solution for the WMF servers?

[1] http://d-scholarship.pitt.edu/24944/1/mehmud_abliz_dissertation.pdf
(just a random pick)
[2]
https://searchsecurity.techtarget.com/answer/TLS-protocol-Can-a-client-puzzle-improve-security
(about
TLS, but can also be done at the application level)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Abusefilter and Content Translator

2019-04-28 Thread John Erling Blad
It is either the limit for unedited automatic translations that is set way
too high, or an admin who blames Google for whatever translated text (s)he
finds. The latter is not uncommon in Norwegian, even if the admins are told
several times that ContentTranslation does not use Google for Norwegian
translations. (There might be some weird language where it is used.)

Not sure where I found it, but someone (somewhere) has said that the
ideographic signs in Chinese are fairly easy to translate. (Don't tell any
Chinese that ideographic signs are easy; they believe they are extremely
difficult!) It is similar to the Nynorsk – Bokmål problem: the limits are
set fairly strictly, meaning a lot of content has to change, which implies
publishing of the text will be blocked.

Which reminds me that the limits for Nynorsk – Bokmål should be further
relaxed.

Use of simple limits has a nasty effect: users tend to add filler words to
increase the amount of edited text, when instead they should delete
unnecessary constructs.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-24 Thread John Erling Blad
Sorry, but this is not valid. I can't leave this uncommented.

Assume the article is right, then all metrics would be bad. Thus we
can't find any example that contradicts the statement in the article.

If we pick coverage of automated tests as a metric, then _more_ test
coverage would be bad, given the article's premise. Clearly there can be
bad tests, like any code, but assuming the tests are valid, would
increasing the coverage be bad as such? Clearly not.

Pick another example, like cyclomatic complexity. Assume some code
controls what or how we measure CC. If we change this code so that _more_
code is covered by CC measurements, then this would be bad given the
article's premise. Again, clearly not.

Yet another one: code duplication. Assume some code measures code bloat
by a simple duplication test. Testing more code for code bloat would
then be bad, given the article's premise. Would all code duplication be
bad? Not if you must keep speed up in tight loops. So perhaps you may
say a metric for code duplication could be wrong sometimes.

Measuring code quality is completely valid, as is measuring article
quality. The former is disputed, but the latter is accepted as a
GoodThing™ by the same programmers. Slightly rewritten: "Don't count
me, I'll count you!"

On Thu, Mar 21, 2019 at 7:07 AM Gergo Tisza  wrote:
>
> On Wed, Mar 20, 2019 at 2:08 PM Pine W  wrote:
>
> > :) Structured data exists regarding many other subjects such as books and
> > magazines. I would think that a similar approach could be taken to
> > technical debt. I realize that development tasks have properties and
> > interactions that change over time, but I think that having a better
> > quantitative understanding of the backlog would be good and would likely
> > improve the quality of planning and resourcing decisions.
> >
>
> Focusing on metrics is something bad managers tend to do when they don't
> have the skills or knowledge to determine the actual value of the work.
> It's a famous anti-pattern. I'll refer you to the classic Spolsky article:
> https://www.joelonsoftware.com/2002/07/15/20020715/
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-24 Thread John Erling Blad
It is a strange discussion, especially as it is now about how some
technical debts are not _real_ technical debts. You have some code,
you change that code, and breakage emerges both now and for future
projects. That creates technical debt. Some of it has a more
pronounced short-term effect (user-observed bugs), and some of it has a
more long-term effect (it blocks progress). At some point you must fix
all of it.

On Thu, Mar 21, 2019 at 11:10 PM Pine W  wrote:
> It sounds like we have different perspectives. However, get the impression
> that people are getting tired of the this topic, so I'll move on.

I don't think this will be solved, so "move on" seems like an obvious choice.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-19 Thread John Erling Blad
On Tue, Mar 19, 2019 at 12:53 PM bawolff  wrote:
>
> Technical debt is by definition "ickyness felt by devs". It is a thing that
> can be worked on. It is not the only thing to be worked on, nor should it
> be, but it is one aspect of the system to be worked on. If its ignored it
> makes it really hard to fix bugs because then devs wont understand how the
> software works. If tech debt is worked on at the expense of everything
> else, that is bad too (like cleaning your house for a week straight without
> stopping to eat-bad outcomes) By definition it is not new features nor is
> it ickyness felt by users. It might help with bugs felt by users as often
> they are the result of devs misunderstanding what is going on, but that is
> a consequence not the thing itself.

The devs are not the primary user group, and they never will be. An
editor is a primary user, and (s)he has no idea where the letters
travel or how they are stored. A reader is a primary user, and
likewise (s)he has no idea how the letters emerge on the screen.

The devs are just one of several parties in a stakeholder group, and
focusing solely on whatever ickyness they feel is like building a house
by starting with calling the plumber.

> Sales dept usually dont advocate for bug fixing as that doesnt sell
> products, new features do, so i dont know why you are bringing them up.
> They also dont usually deal with technical debt in the same way somebody
> who has never been to your house cant give you effective advice on how to
> clean it.

A sales department is in contact with the customer, who is a primary user
of the product. If you don't like using the sales department, then say
you have a support desk that doesn't report bugs. Without anyone
reporting the bugs, the product is dead.

Actually, this is the decades-old fight over "who owns the product". The
only solution is to create a real stakeholder group.

> That said, fundamentally you want user priorities (or at least *your*
> priorities. Its unclear if your priorities reflect the user base at large)
> to be taken into consideration when deciding developer priorities? Well
> step 1 is to define what you want. The wmf obviously tries to figure out
> what is important to users, and its pretty obvious in your view they are
> failing. Saying people are working on the wrong thing without saying what
> they should work on instead is a self-fulfiling prophecy.

Not going to answer this, it is an implicit blame game

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-19 Thread John Erling Blad
On Mon, Mar 18, 2019 at 10:52 PM bawolff  wrote:
>
> First of all, I want to say that I wholeheartedly agree with everything tgr
> wrote.
>
> Regarding Pine's question on technical debt.
>
> Technical debt is basically a fancy way of saying something is "icky". It
> is an inherently subjective notion, and at least for me, how important
> technical debt is depends a lot on how much my subjective sensibilities on
> what is icky matches whoever is talking about technical debt.
>
> So yes, I think everyone agrees icky stuff is bad, but sometimes different
> people have different ideas on what is icky and how much ickiness the icky
> things contain. Furthermore there is a trap one can fall into of only
> fixing icky stuff, even if its only slightly icky, which is bad as then you
> don't actually accomplish anything else. As with everything else in life,
> moderation is the best policy (imo).
>
> --
> Brian

To set the degree of ickyness you need a stakeholder group, which is often
just the sales department. When you have neither a stakeholder group
nor a sales department you tend to end up with ickyness set by the devs,
and then features win over bugs. It's just the way things are.

I believe the ickyness felt by the editors must be more visible to the
devs, and the actual impact the devs have on bugs to lower the ickyness
must be more visible to the editors.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-19 Thread John Erling Blad
On Sun, Mar 17, 2019 at 2:38 PM C. Scott Ananian  wrote:
>
> A secondary issue is that too much wiki dev is done by WMF/WMFDE employees
> (IMO); I don't think the current percentages lead to an overall healthy
> open source community. But (again in my view) the first step to nurturing
> and growing our non-employee contributors is to make sure their patches are
> timely reviewed.
>   --scott

I find this argument strange, as it implies there is some kind of
magical difference between contributions from an employee and a
community member. There is no such difference. Both the employee and
the community member should take responsibility for the code base, but
that does not imply they should take the same actions on that code
base.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-19 Thread John Erling Blad
> On Sat, Mar 16, 2019 at 8:23 AM Strainu  wrote:
>
> > A large backlog by itself is not alarming. A growing one for
> > components deployed to WMF sites is. It indicates insufficient
> > attention is given to ongoing maintenance of projects after they are
> > no longer "actively developed", which in turn creates resentment with
> > the reporters.
> >
>
On Sun, Mar 17, 2019 at 10:22 PM Gergo Tisza  wrote:
>
> It really doesn't. The backlog is the contact surface between stuff that
> exists and stuff that doesn't; all the things we don't have but which seem
> realistically within reach. As functionality expands, that surface expands
> too. It is a normal process.
>

This isn't quite right; it only holds in some kind of simplified and
idealized environment.

There are several axes, not only what exists. For example, existing and
non-existing features might be on the same axis, while it is hard to
say that functional vs. non-functional code is on the same axis. If you
say these two are on the same axis, "stuff that exists", then you end
up arguing that fixing bugs would be a problem, as it expands the feature
space, thus increasing the total space and then increasing the
technical debt.

This would imply that introducing a critical bug will solve the
technical debt, as the contact space will collapse. Quite an
acceptable solution! ;D

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Gerrit outage

2019-03-17 Thread John Bennett
Hello,

Today we have seen Phabricator vandalism from an attacker who was also
responsible for the Gerrit outage yesterday. I’d like to clarify a comment
I made yesterday and provide as many additional details as I can while
still maintaining operational security.

While no user accounts were compromised, the attacker leveraged a
vulnerability in Gerrit to compromise a single staff account. This discovery
is what led to taking Gerrit offline so that an investigation could occur, the
vulnerability could be remediated, and the service restored.  However, no
further evidence of compromise was discovered and additional security
controls prevented malicious activities from being executed using the
compromised staff account. We will continue to monitor the situation and
will provide updates on this list and on the Phabricator task
https://phabricator.wikimedia.org/T218472.

Thanks

John


On Sat, Mar 16, 2019 at 2:25 PM John Bennett  wrote:

> Hello,
>
> Gerrit is available again but we are continuing to investigate the
> suspicious activity.  Our preliminary findings point to no users or
> production systems being compromised and no loss of any confidential
> information. As we continue to investigate over the next few days we will
> add any appropriate updates to the phabricator task (
> https://phabricator.wikimedia.org/T218472 ) .
>
> Thanks
>
>
> On Sat, Mar 16, 2019 at 10:26 AM John Bennett 
> wrote:
>
>> Hello,
>>
>>
>> On 16 March 2019, Wikimedia Foundation staff observed suspicious activity
>> associated with Gerrit and as a precautionary step has taken Gerrit offline
>> pending investigation.
>>
>>
>> The Wikimedia Foundation's Security, Site Reliability Engineering and
>> Release Engineering teams are investigating this incident as well as
>> potential improvements to prevent future incidents. More information will
>> be posted on Phabricator (https://phabricator.wikimedia.org/T218472 ) as
>> it becomes available and is confirmed. If you have any questions, please
>> contact the Security (secur...@wikimedia.org
>> ).
>>
>>
>> Thanks
>>
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Gerrit outage

2019-03-16 Thread John Bennett
Hello,

Gerrit is available again but we are continuing to investigate the
suspicious activity.  Our preliminary findings point to no users or
production systems being compromised and no loss of any confidential
information. As we continue to investigate over the next few days we will
add any appropriate updates to the phabricator task (
https://phabricator.wikimedia.org/T218472 ) .

Thanks


On Sat, Mar 16, 2019 at 10:26 AM John Bennett 
wrote:

> Hello,
>
>
> On 16 March 2019, Wikimedia Foundation staff observed suspicious activity
> associated with Gerrit and as a precautionary step has taken Gerrit offline
> pending investigation.
>
>
> The Wikimedia Foundation's Security, Site Reliability Engineering and
> Release Engineering teams are investigating this incident as well as
> potential improvements to prevent future incidents. More information will
> be posted on Phabricator (https://phabricator.wikimedia.org/T218472 ) as
> it becomes available and is confirmed. If you have any questions, please
> contact the Security (secur...@wikimedia.org
> ).
>
>
> Thanks
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Gerrit outage

2019-03-16 Thread John Bennett
Hello,


On 16 March 2019, Wikimedia Foundation staff observed suspicious activity
associated with Gerrit and as a precautionary step has taken Gerrit offline
pending investigation.


The Wikimedia Foundation's Security, Site Reliability Engineering and
Release Engineering teams are investigating this incident as well as
potential improvements to prevent future incidents. More information will
be posted on Phabricator (https://phabricator.wikimedia.org/T218472 ) as it
becomes available and is confirmed. If you have any questions, please
contact the Security team (secur...@wikimedia.org).


Thanks
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-14 Thread John Erling Blad
Yes, there should always be a response to all bugs. Without a response
the impression in the reporting wiki-community would be "nobody cares
about our bug reports".

Someone in the community finds a bug, and it is posted and discussed
in the community. Then another one writes a report in a task at
Phabricator, but nothing further happens. A couple of months later the
first one asks again about the bug, but does not get a satisfactory
answer, and gets angry. This usually happens in cycles of a few months
to a year. We must somehow break those cycles; they are bad and
disruptive and create an "us and them" attitude.

Users from the wiki-communities don't visit Phabricator to see all
those small administrative tasks, they see the notes from the official
and unofficial tech ambassadors, and they see the changes in the
"tracked" templates. The templates are only changed when the bugs are
closed for whatever reason, which could take years. Creating
additional manual interventions does not work; the process must be
simpler and more efficient.

On Thu, Mar 14, 2019 at 1:23 PM Andre Klapper  wrote:
>
> On Tue, 2019-03-12 at 00:29 +0100, John Erling Blad wrote:
> > It seems like some projects simply put everything coming from external
> > sources into the deep freezer or add "need volunteer" – if they respond at
> > all. In some cases it could be that the projects are defunct.
>
> Where does the expectation come from that there should always be a response?
> If a bug report has all the info needed to allow someone to reproduce and
> work on it, anyone who is interested is free to pick it up and work on it.
> No further response is needed.
>
> Cheers,
> andre
> --
> Andre Klapper | Bugwrangler / Developer Advocate
> https://blogs.gnome.org/aklapper/
>
>
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-14 Thread John Erling Blad
Sorry, but I am trying to point out that the process is broken and to give a
few examples of how to fix it.

On Thu, Mar 14, 2019 at 1:20 PM Andre Klapper  wrote:
>
> On Thu, 2019-03-14 at 12:35 +0100, John Erling Blad wrote:
> > Blame games do not fix faulty processes.
>
> Hmm, why is this thread called "Question to WMF" instead of "Question
> to developers"?
>
> > Why do we have bugs that aren't handled for years?
>
> Basically: Because you did not fix these bugs. Longer version:
> https://www.mediawiki.org/wiki/Bug_management/Development_prioritization
>
> > Why is it easier to get a new feature than fixing an old bug?
>
> {{Citation needed}}.
> If that was the case: Because your priority was to write code for a new
> feature instead of fixing an old bug. Longer version:
> https://www.mediawiki.org/wiki/Bug_management/Development_prioritization
>
> > Google had a problem with unfixed bugs, and they started identifying
> > the involved developers each time the build was broken. That is pretty
> > harsh, but what if devs were somehow named when their bugs were
> > mentioned? What if there were some kind of public statistic? How would
> > the devs react to being identified with a bug? Would they fix the bug,
> > or just be mad about it? Devs on some of Google's teams got mad, but in
> > the end the code was fixed. Take a look at "GTAC 2013 Keynote:
> > Evolution from Quality Assurance to Test Engineering" [1]
>
> Not really - I see 6 open bug reports in Chromium, for example:
> https://bugs.chromium.org/p/chromium/issues/list
> (Only if you want to imply that only "Google" was responsible for
> fixing all bugs in that free and open source project, of course.)
>
> > What if we could show information from the bugs in Phabricator in a
> > "tracked" template at other wiki-projects, identifying the team
> > responsible and perhaps even the dev assigned to the bug? Imagine the
> > creds the dev would get when the bug is fixed! Because it is easy to
> > lose track of pages with "tracked" templates, we need some other means
> > to show this information, and our "public monitor" could be a special
> > page with the same information.
>
> Feel free to extend https://www.mediawiki.org/wiki/Template:Tracked
>
> > We say we don't want voting on bugs, but by saying that we refuse
> > to get stats on how many users a specific bug hits, and because of
> > that we don't get sufficient information (metrics) to make decisions
> > about specific bugs.
>
> I disagree. Different people see different priorities. Longer version:
> https://www.mediawiki.org/wiki/Bug_management/Development_prioritization
>
> > What if users could give a "this hits me too" from a "tracked"
> > template. That would give a very simple metric on how important it is
> > to fix a problem.
>
> It does not, because software development is not a popularity contest:
> https://www.mediawiki.org/wiki/Bug_management/Development_prioritization
> Voting would create expectations that nobody will fulfill.
>
> Cheers,
> andre
> --
> Andre Klapper | Bugwrangler / Developer Advocate
> https://blogs.gnome.org/aklapper/
>
>
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-14 Thread John Erling Blad
Blame games do not fix faulty processes. You fix a sinkhole by
figuring out where the water comes from and where it goes.

Why do we have bugs that aren't handled for years? Why is it easier to
get a new feature than fixing an old bug?

Google had a problem with unfixed bugs, and they started identifying
the involved developers each time the build was broken. That is pretty
harsh, but what if devs were somehow named when their bugs were
mentioned? What if there were some kind of public statistic? How would
the devs react to being identified with a bug? Would they fix the bug,
or just be mad about it? Devs on some of Google's teams got mad, but in
the end the code was fixed. Take a look at "GTAC 2013 Keynote:
Evolution from Quality Assurance to Test Engineering" [1]

What if we could show information from the bugs in Phabricator in a
"tracked" template at other wiki-projects, identifying the team
responsible and perhaps even the dev assigned to the bug? Imagine the
creds the dev would get when the bug is fixed! Because it is easy to
lose track of pages with "tracked" templates, we need some other means
to show this information, and our "public monitor" could be a special
page with the same information.

We say we don't want voting on bugs, but by saying that we refuse
to get stats on how many users a specific bug hits, and because of
that we don't get sufficient information (metrics) to make decisions
about specific bugs. Some bugs (or missing features), though, change
how users do specific things; how do we handle that?

What if users could give a "this hits me too" from a "tracked"
template. That would give a very simple metric on how important it is
to fix a problem. To make this visible to the wiki-communities the
special page could be sorted on this metric. Of course the devs would
have completely different priorities, but this page would list the
wiki-communities' priorities.

It would be a kind of blame game, but it would also give the devs an
opportunity to get sainthood by fixing annoying bugs.

[1] https://www.youtube.com/watch?v=nyOHJ4GR4iU from 32:20

On Wed, Mar 13, 2019 at 11:49 PM Andre Klapper  wrote:
>
> On Wed, 2019-03-13 at 21:01 +0100, John Erling Blad wrote:
> > This is like an enormous sinkhole, with people standing on the edge,
> > warning about the sinkhole. All around people are saying "we must do
> > something"! Still the sinkhole slowly grows larger and larger. People
> > placing warning signs "Sinkhole ahead". Others notifying neighbors
> > about the growing sinkhole. But nobody does anything about the
> > sinkhole itself.
>
> And repeating the same thing over and over again while repeatedly
> ignoring requests to be more specific won't help either...
>
> andre
> --
> Andre Klapper | Bugwrangler / Developer Advocate
> https://blogs.gnome.org/aklapper/
>
>
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-13 Thread John Erling Blad
This is like an enormous sinkhole, with people standing on the edge,
warning about the sinkhole. All around people are saying "we must do
something"! Still the sinkhole slowly grows larger and larger. People
placing warning signs "Sinkhole ahead". Others notifying neighbors
about the growing sinkhole. But nobody does anything about the
sinkhole itself.

I doubt this will be fixed.

John

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-12 Thread John Erling Blad
What frustrates me the most are:

- bugs found by the editor community that have obvious, simple fixes,
but aren't acted upon for several years
- new features that aren't fully tested, so you have to answer to the
community about stuff you would rather throw out
- new features and changes that are advertised but never implemented

The first one is perhaps the most easily fixed. I believe WMF
could either set up an official bug squad or use bug bounties to
speed up the fixing of bugs. I tend to believe bug bounties work best,
but it would be really nice to know that bugs are handled in an
orderly fashion by a bug squad.

When introducing new features, make a help page at Meta or MediaWiki.org,
and link to the page from the feature. On that page make a visible
"Don't panic!" link that points to the issue tracker. Don't expect the
users to figure out which extension provides the specific feature;
they don't have a clue. For all important issues give an estimated fix
time, and if no one is working on the issue, say so. Don't assume the users
understand fancy wording about "need volunteer". Need volunteer for
what? Making coffee?

Some features are described in Phabricator, which is fine, but some of
them have extensive cookie licking, which could give someone the
impression that you actually will implement the feature. That often
leads to users asking about the feature and when it will arrive at
their project. When it does not arrive, users get upset. If you are
working on something, say so, but also be very clear if something has
gone into your personal freezer.


On Tue, Mar 12, 2019 at 9:35 PM Pine W  wrote:
>
> I'm going to make a few points that I think will respond to some comments
> that I've read, and I will try to organize some of my previous points so
> that they're easier to follow.
>
> 1. My impression is that there's agreement that there is a huge backlog.
>
> 2. I think that there's consensus that the backlog is a problem.
>
> 3. I think that we're collectively unsure how to address the backlog, or
> how to analyze the backlog so that everyone has shared situational
> awareness, but we collectively seem to agree that the backlog should be
> addressed.
>
> If any of the above is wrong, please feel free to say so.
>
> Regarding my own opinions only, I personally am frustrated regarding
> multiple issues:
>
> a. that there's been discussion for years about technical debt,
>
> b. that WMF's payroll continues to grow, and while I think that more
> features are getting developed, the backlog seems to be continuing to grow,
>
> c. that WMF, which has the largest budget of any Wikimedia entity, is not
> more transparent regarding how it spends money and what is obtained from
> that spending;
>
> d. that although I think that the Community Liaisons help a lot with
> communications, there remains much room for improvement of communications.
> One of my larger frustrations is that the improvements regarding
> communications have not been more extensive throughout all of WMF.
>
> e. that WMF retains the ability to veto community decisions regarding
> decisions such as deployments of features, but the community has little
> ability to veto WMF decisions. I think that staff as individuals and
> collectively have become more willing to negotiate and yield ground in the
> past few years, which is good, but I remain concerned that these are
> informal rather than formal changes.
>
> f. that I think that some of what WMF does is good and I want to support
> those activities, but there are other actions and inactions of WMF that I
> don't understand or with which I disagree. Conflicts can be time consuming
> and frustrating for me personally, and my guess is that others might feel
> the same, including some people in WMF. I don't know how to solve this. I
> realize that some people might say "Then you should leave", and I regularly
> consider that option, but Wikipedia and the sister projects do a lot of
> good, so I'm torn. This is very much my personal issue and I don't expect
> to discuss it more on Wikitech-l, but it's a factor in how I think about
> this email thread, which is why I mention it.
>
> I hope that this email provides some clarity and is useful for this
> discussion.
>
> Pine
> ( https://meta.wikimedia.org/wiki/User:Pine )
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-12 Thread John Erling Blad
On Tue, Mar 12, 2019 at 10:29 PM Bartosz Dziewoński  wrote:
>
> I get an impression from this thread that the problem is not really the
> size of the backlog, but rather certain individual tasks that sit in
> said backlog rather than being worked on, and which according to John
> are actually major issues.

Sorry, but what I said initially was "bugs with known fixes and
available patches". It could be some "major issues", but I have not
referred to any one, and I'm not sure it is wise to start discussing
specific issues.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-12 Thread John Erling Blad
Without the editors there would be no content, and thus no readers,
and without readers there would be no use for the software provided.
So are the actual users subsidizing the software? Definitely yes! The
content is the primary reason why we have readers. The software is
just a tool to provide the content in an accessible form to the
readers.

Whether an editor is a customer subsidizing the product directly or
indirectly is not much of a concern; without the content there would be no
subsidizing at all, from any party – ever.

The primary customers of the software are the editors, but the primary
customers of the content are the readers.

On Tue, Mar 12, 2019 at 2:18 AM David Barratt  wrote:
>
> A customer, by definition (https://en.wikipedia.org/wiki/Customer)
> exchanges something of value (money) for a product or service.
>
> That does not mean that a freemium model (
> https://en.wikipedia.org/wiki/Freemium) is not a valid business model.
> However, if there is no exchange of value, the person consuming the free
> version of the product or service, is not (yet) a customer.
>
> If MediaWiki is the thing we give away for free, what do we charge money
> for?
> Are our customers successfully subsidizing our free (as in beer) software?
>
> On Mon, Mar 11, 2019 at 7:33 PM John Erling Blad  wrote:
>
> > > 2- Everything is open-source and as non-profit, there's always resource
> > > constraint. If it's really important to you, feel free to make a patch
> > and
> > > the team would be always more than happy to review.
> >
> > Wikipedia is the core product, and the users are the primary
> > customers. When a group of core customers requests a change, then the
> > service provider should respond. Whether the service provider is a
> > non-profit doesn't really matter; the business model is not part of
> > the functional requirement. The service provider should simply make
> > sure the processes function properly.
> >
> > If the service provider has resource constraints, then it must scale
> > the services until it gets a reasonable balance, but that does not
> > seem to be the case here. It is more like there is no process, or the
> > process is defunct.
> >
> > The strange thing is: for many projects the primary customers aren't
> > even part of a stakeholder group; the devs in the groups define
> > themselves as the "product user group". That tends to skew development
> > from bugs to features. Perhaps that is what happens in general here:
> > too many techies who believe they are the primary customers.
> >
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-11 Thread John Erling Blad
> 2- Everything is open-source and as non-profit, there's always resource
> constraint. If it's really important to you, feel free to make a patch and
> the team would be always more than happy to review.

Wikipedia is the core product, and the users are the primary
customers. When a group of core customers requests a change, then the
service provider should respond. Whether the service provider is a
non-profit doesn't really matter; the business model is not part of
the functional requirement. The service provider should simply make
sure the processes function properly.

If the service provider has resource constraints, then it must scale
the services until it gets a reasonable balance, but that does not
seem to be the case here. It is more like there is no process, or the
process is defunct.

The strange thing is: for many projects the primary customers aren't
even part of a stakeholder group; the devs in the groups define
themselves as the "product user group". That tends to skew development
from bugs to features. Perhaps that is what happens in general here:
too many techies who believe they are the primary customers.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-11 Thread John Erling Blad
It seems like some projects simply put everything coming from external
sources into the deep freezer or add "need volunteer" – if they respond at
all. In some cases it could be that the projects are defunct.

On Mon, Mar 11, 2019 at 9:51 PM Stas Malyshev  wrote:
>
> Hi!
>
> > In my experience WMF teams usually have a way to distinguish "bugs we're
> > going to work on soon" and "bugs we're not planning to work on, but we'd
> > accept patches". This is usually public in Phabricator, but not really
> > documented.
>
> There's the "Need volunteer" tag that I think can be used for that.
> --
> Stas Malyshev
> smalys...@wikimedia.org
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-08 Thread John Erling Blad
> Also should be on the list: Sometimes bugs have a known fix that isn't
> being rolled out, in favour of a larger more fundamental restructuring
> (demanding even more resources).

Yes, I've seen a lot of cookie licking. It makes it hard to solve even
simple bugs.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Question to WMF: Backlog on bugs

2019-03-08 Thread John Erling Blad
The backlog for bugs is pretty large (that is an understatement),
even for bugs with known fixes and available patches. Is there any real
plan to start fixing them? Shall I keep telling the community the bugs
are "tracked"?

/jeblad

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] A potential new way to deal with spambots

2019-02-13 Thread John Erling Blad
It is extremely easy to detect a bot unless the bot operator chooses to make
it hard. Just make a model of how the user interacts with the input
devices, and do anomaly detection. That implies the use of JavaScript, though,
but users not using JS are either very dubious or quite well known. There are
nearly no new users that do not use JS.
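
A minimal sketch of the idea (in Lua rather than the JavaScript that would
actually collect the events, and with a made-up scoring rule, so nothing more
than an illustration): the anomaly score could be as simple as measuring how
regular the input timing is.

    -- Score how "regular" a stream of input events is; scripted input tends
    -- to be far more regular than human typing or mouse movement.
    local function interactionScore( timestamps )
        assert( #timestamps >= 3, 'need at least three events' )
        local deltas = {}
        for i = 2, #timestamps do
            deltas[#deltas + 1] = timestamps[i] - timestamps[i - 1]
        end
        local sum = 0
        for _, d in ipairs( deltas ) do sum = sum + d end
        local mean = sum / #deltas
        local varSum = 0
        for _, d in ipairs( deltas ) do varSum = varSum + ( d - mean ) ^ 2 end
        -- Coefficient of variation: near zero means suspiciously regular input.
        return math.sqrt( varSum / #deltas ) / mean
    end

    print( interactionScore( { 0, 100, 200, 300, 400 } ) )  --> 0 (perfectly regular, bot-like)
    print( interactionScore( { 0, 130, 210, 370, 420 } ) )  --> ~0.41 (human-like jitter)

A real model would of course use richer features (mouse trajectories, focus
changes, paste events) and a proper threshold, but the principle is that simple.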

Reused a previous tex-file, and did not clean it up? "Magnetic Normal Modes
of Bi-Component Permalloy Structures" ;)


On Mon, Feb 11, 2019 at 6:47 PM Aaron Halfaker 
wrote:
>
> We've been working on unflagged bot detection on my team.  It's far from a
> real product integration, but we have shown that it works in practice.  We
> tested this in Wikidata, but I don't see a good reason why a similar
> strategy wouldn't work for English Wikipedia.
>
> Hall, A., Terveen, L., & Halfaker, A. (2018). Bot Detection in Wikidata
> Using Behavioral and Other Informal Cues.
> *Proceedings of the ACM on Human-Computer Interaction*, *2*(CSCW), 64.
pdf
> 
>
> In theory, we could get this into ORES if there was strong demand.  As
Pine
> points out, we'd need to delay some other projects.  For reference, the
> next thing on the backlog that I'm looking at is setting article quality
> prediction for Swedish Wikipedia.
>
> -Aaron
>
> On Mon, Feb 11, 2019 at 11:19 AM Jonathan Morgan 
> wrote:
>
> > This may be naive, but... isn't the wishlist filling this need? And if
not
> > through a consensus-driven method like the wishlist, how should a WMF
team
> > prioritize which power user tools it needs to focus on?
> >
> > Or is just a matter of "Yes, wishlist, but more of it"?
> >
> > - Jonathan
> >
> > On Mon, Feb 11, 2019 at 2:34 AM bawolff  wrote:
> >
> > > Sure its certainly a front we can do better on.
> > >
> > > I don't think Kasada is a product that's appropriate at this time.
> > Ignoring
> > > the ideological aspect of it being non-free software, there's a lot of
> > easy
> > > things we could and should try first.
> > >
> > > However, I'd caution against viewing this as purely a technical
problem.
> > > Wikimedia is not like other websites - we have allowable bots. For
many
> > > commercial websites, the only good bot is a dead bot. Wikimedia has
many
> > > good bots. On enwiki usually they have to be approved, I don't think
> > that's
> > > true on all wikis. We also consider it perfectly ok to do limited
testing
> > > of bots before it is approved. We also encourage the creation of
> > > alternative "clients", which from a server perspective looks like a
bot.
> > > Unlike other websites where anything non-human is evil, here we need
to
> > > ensure our blocking corresponds to social norms of the community. This
> > may
> > > sound not that hard, but I think it complicates botblocking more than
is
> > > obvious at first glance.
> > >
> > > Second, this sort of thing is something that tends to fall through the
> > > cracks at WMF. AFAIK the last time there was a team responsible for
admin
> > > tools & anti-abuse was 2013 (
> > > https://www.mediawiki.org/wiki/Admin_tools_development). I believe
> > > (correct
> > > me if I'm wrong) that anti-harrasment team is all about human
harassment
> > > and not anti-abuse in this sense. Security is adjacent to this
problem,
> > but
> > > traditionally has not considered this problem in scope. Even core
tools
> > > like checkuser have been largely ignored by the foundation for many
many
> > > years.
> > >
> > > I guess this is a long winded way of saying - I think there should be
a
> > > team responsible for this sort of stuff at WMF, but there isn't one. I
> > > think there's a lot of rather easy things we can try (Off the top of
my
> > > head: Better captchas. More adaptive rate limits that adjust based on
how
> > > evilish you look, etc), but they definitely require close involvement
> > with
> > > the community to ensure that we do the actual right thing.
> > >
> > > --
> > > Brian
> > > (p.s. Consider this a volunteer hat email)
> > >
> > > On Sun, Feb 10, 2019 at 6:06 AM Pine W  wrote:
> > >
> > > > To clarify the types of unwelcome bots that we have, here are the
ones
> > > that
> > > > I think are most common:
> > > >
> > > > 1) Spambots
> > > >
> > > > 2) Vandalbots
> > > >
> > > > 3) Unauthorized bots which may be intended to act in good faith but
> > which
> > > > may cause problems that could probably have been identified during
> > > standard
> > > > testing in Wikimedia communities which have a relatively well
developed
> > > bot
> > > > approval process. (See
> > > > https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval.)
> > > >
> > > > Maybe unwelcome bots are not a priority for WMF at the moment, in
which
> > > > case I could add this subject into a backlog. I am sorry if I sound
> > > grumpy
> > > > at WMF regarding this subject; this is a problem but I know that
there
> > > are
> > > > millions of problems and I don't expect a different project to be
> > dropped
> > > > 

Re: [Wikitech-l] The mw.ext construct in lua modules

2019-02-05 Thread John Erling Blad
Those that break the naming scheme *somehow* are 7 extensions
(ArticlePlaceholder (mixed case), DynamicPageListEngine (not extension
name), JsonConfig (not extension name), LinkedWiki (not ext
structure), SemanticScribunto (not extension name), Wikibase Client
(not ext structure), ZeroPortal (not ext structure)) out of a total of 17.
I have not counted two of my own that will not follow this scheme, and
Capiunto, which uses require. Nor have I included TemplateData.

That is, the naming scheme is followed by approximately 40% of the extensions.

There are probably some lua-libs I haven't found.

-- list --
# ArticlePlaceholder
mw.ext.articlePlaceholder

# TitleBlacklist
mw.ext.TitleBlacklist (Only a single method)

# BootstrapComponents
mw.bootstrap.*

# Capiunto
Doc says mw.capiunto, but this seems wrong

# Cargo
mw.ext.cargo

# DataTable2
mw.ext.datatable2

# DisplayTitle
mw.ext.displaytitle

# DynamicPageListEngine
mw.ext.dpl

# FlaggedRevs
mw.ext.FlaggedRevs

# Inference
Under development.

# JsonConfig
mw.ext.data.get

# LinkedWiki
mw.linkedwiki

# ParserFunctions
mw.ext.ParserFunctions (Only a single method)

# Pickle
Under development. Uses another loader.

# SemanticScribunto
mw.smw

# TimeConvert
mw.ext.timeconvert

# TitleBlacklist
mw.ext.TitleBlacklist

# VariablesLua
mw.ext.VariablesLua

# Wikibase Client
mw.wikibase

# ZeroPortal
mw.zeroportal

On Tue, Feb 5, 2019 at 5:50 PM Brad Jorsch (Anomie)
 wrote:
>
> On Mon, Feb 4, 2019 at 5:26 PM Eran Rosenthal  wrote:
>
> > > What is the problem with the ".ext" part?
> > 1. It adds unnecessary complexity both in the extension (need to init
> > mw.ext if it doesn't exist)
>
>
> It's one line in the boilerplate. That's not much complexity.
>
>
> > and more important - in its usage when the Lua
> > extension is invoked (longer names)
> >
>
> It's 4 characters. Also not much to be concerned about. You're also free to
> do like
>
> local foo = mw.ext.foo;
>
> if you want shorter access within your code.
>
>
> >(there is very small risk of name collision -  mw.ModuleA and mw.ModuleB
> > are unlikely to clash as different extensions, and mw.ModuleA and mw.FUNC
> > are unlikely to clash because function names
> > <
> > https://www.mediawiki.org/wiki/Extension:Scribunto/Lua_reference_manual#Base_functions
> > >
> > are usually verbs and extensions
> >  are usually
> > nouns)
> >
>
> Scribunto has its own built-in packages too, which are also usually nouns.
> What if, for example, Extension:Math
>  added a Scribunto module at
> "mw.math" and then we also wanted to add a Scribunto-specific version of Lua's
> math library
> ?
> Or Extension:CSS  and a
> Scribunto counterpart to mw.html
> ?
> Or if Extension:UserFunctions
>  did its thing at
> "mw.user" and then we got around to resolving T85419
> ?
>
> Having mw.ext also makes it easier to identify extensions' additions,
> avoiding confusion over whether "mw.foo" is part of Scribunto or comes from
> another extension. And it means you can look in mw.ext to see which
> extensions' additions are available rather than having to filter them out
> of mw.
>
> BTW, we have "mw" in the first place to similarly bundle Scribunto's
> additions away from things that come with standard Lua. If someday standard
> Lua includes its own "ustring" or something else Scribunto adds a module
> for (and we upgrade from Lua 5.1), we won't need to worry about name
> collision there either.
>
>
> > 2. Practically the convention is to not use mw.ext - the convention (based
> > on most of the Lua code - e.g wikibase) is to not use mw.ext
> >
>
> Of extensions in Gerrit (as of a few days ago when I last checked),
> Wikibase and LinkedWiki seem to be the only two extensions not using
> mw.ext, while Cargo, DataTable2, DisplayTitle, DynamicPageListEngine,
> FlaggedRevs, JsonConfig, ParserFunctions, and TitleBlacklist all do.
>
> --
> Brad Jorsch (Anomie)
> Senior Software Engineer
> Wikimedia Foundation
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] The mw.ext construct in lua modules

2019-01-25 Thread John Erling Blad
There are several extensions that diverge from the naming scheme. Some
of them are even referenced as using the scheme while not providing
Lua libs at all. It is a bit weird.

On Fri, Jan 25, 2019 at 7:09 PM Kunal Mehta  wrote:
>
> Hi,
>
> On 1/24/19 11:33 PM, Thiemo Kreuz wrote:
> > Is there a question assigned with this long email? Is this a call
> > for feedback?
>
> I think this is probably related to/coming from
> .
>
> -- Legoktm
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] The mw.ext construct in lua modules

2019-01-25 Thread John Erling Blad
Half a century? 50 years? You have been working for WMDE since 2014.
Perhaps it would be an idea to discuss the naming scheme instead of
making a questionable call to authority?

The interesting point is _what_ there is to gain by adding unrelated character
sequences to names. If some character sequence doesn't convey any
meaningful or important information, then don't add it. It is only
adding noise. You use naming schemes to avoid name clashes, but if the
context has some inherent properties that block name clashes, then
don't add some random character sequences to minimize a chance of a
name clash that is already zero.

What extension could a preloaded library possibly clash with? Such an
extension must be written for MediaWiki, and included in Scribunto,
without having an extension page at mediawiki.org. Maintained outside
Phabricator/Gerrit, yes, but without an extension page? Does that even
make sense?

I make up my own mind, and if I wanted to quote somebody else I would
have done so.

On Fri, Jan 25, 2019 at 2:39 PM Thiemo Kreuz  wrote:
>
> "How it should be done" according to whom? This might be a dumb
> question, but I had the impression you are speaking for a larger group
> of people in your initial post. I would like to understand the context
> better in which the proposed standard came to be.
>
> Personally, I don't support the idea of an open-for-everything
> "mw.randomStuff" naming scheme. It's half a century that I'm actively
> working with code that contains the sequence "mw." literally thousands
> of times: https://codesearch.wmflabs.org/search/?q=%5Cbmw%5C.%5Cb.
> After all these years my expectation is that stuff is only put
> directly in the "mw." namespace when it is general purpose utility
> stuff. And people are even trying to reduce this.
>
> I understand that "mw.ext." is not terribly different from using "mw."
> directly. Both are places for all kinds of unrelated random stuff. But
> I believe it is still useful to have both: "mw." exclusively for
> random stuff that is part of MediaWiki itself, and a different one for
> community code.
>
> Kind regards
> Thiemo
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] The mw.ext construct in lua modules

2019-01-25 Thread John Erling Blad
It is a description of how it should be done, which is not according
to the current page. Yes, it is a call for feedback, if I must spell it
out.

On Fri, Jan 25, 2019 at 8:33 AM Thiemo Kreuz  wrote:
>
> Is there a question assigned with this long email? Is this a call for 
> feedback?
>
> Kind regards
> Thiemo
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] The mw.ext construct in lua modules

2019-01-24 Thread John Erling Blad
At the Extension:Scribunto/Lua reference manual it is pointed out, in
several places,[1] that the Lua libs should use the form 'mw.ext.NAME'.
This creates visual noise in the code. Any lib included should have an
extension page, thus it has already been given a unique name. In
addition, only the libraries that need to be preloaded are added to
the mw-structure, and those are the extensions. The ext-addition is
like saying "this is an extension, and it is only extensions that need
to be added to the mw-struct, so we make it abundantly clear that this
is an extension".

The only case where a name can collide is if some external lib is
included, that external lib has the same name as an extension, and
someone in addition preloads the external lib. The chance is quite
frankly pretty slim, as there are rather few external libs that make
sense to preload in this environment, especially as preloading implies
some kind of interaction with the environment. That means it is an
extension.

I guess I'm stepping on some toes here…

So to make it abundantly clear, not 'mw.ext.NAME' (or 'mw.ext.NaMe',
or 'mw.ext.name') but 'mw.name' (lowercase, not camelcase). If the
call is a constructor or some kind of builder interface, then
'mw.name(…)' is totally valid. I do not believe it is wise to turn the
lib into an instance by the call, but it can return an instance, it
can cache previously returned instances, and it can somehow install
the instance(s) in the current environment.

An extension should have any pure root libs at 'pure/name.lua' and
additional libs at 'pure/name/additional.lua', where 'pure' is
resolved in the 'ScribuntoExternalLibraries' hook.
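
For reference, the Lua side of this is roughly the boilerplate described in
the reference manual; 'myext' below is a made-up name, and the lines the
proposal above would change are the ones that install the table under mw.ext
instead of directly under mw:

    -- Sketch of the documented Lua-side boilerplate for an extension library.
    -- A real extension also registers this file from PHP via the
    -- ScribuntoExternalLibraries hook.
    local myext = {}
    local php

    function myext.setupInterface( options )
        -- Prevent a second call
        myext.setupInterface = nil
        -- Keep the PHP callbacks and hide the global
        php = mw_interface
        mw_interface = nil
        -- The lines under discussion: the manual says mw.ext.myext,
        -- the proposal above would make it mw.myext
        mw = mw or {}
        mw.ext = mw.ext or {}
        mw.ext.myext = myext
        package.loaded['mw.ext.myext'] = myext
    end

    return myext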

[1] 
https://www.mediawiki.org/wiki/Extension:Scribunto/Lua_reference_manual#Extension_libraries_(mw.ext)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Security Notification: Malware creating fake Wikipedia donation banner

2019-01-24 Thread John Bennett
Hello,

In order to keep the community informed of threats against Wikimedia
projects and users, the Wikimedia Security team has some information to
share.

Malware installed via pirated content downloaded from sites such as the
Pirate Bay can cause web browsers compromised by the malware to create a
fake donation banner for Wikipedia users. While the actual malware is not
installed or distributed via Wikipedia, unaware visitors may be confused or
tricked by its activities.

The malware seeks to trick visitors to Wikipedia with what looks like a
legitimate Wikipedia banner asking for donations. Once the user clicks on
the banner, they are then taken to a portal that leads them to transfer
money to a fraudulent bitcoin account that is not controlled by the
Foundation.

The current version of this malware is only infecting Microsoft Windows
users at the time of this notification. To date, the number of people
affected is small. The fraudulent accounts have taken approximately $700
from infected users. However, we strongly encourage all users to use and
update their antivirus software.


Additional details and a screenshot of the fake donation banner can be
found at Bleepingcomputer.com. [0]

[0]
https://www.bleepingcomputer.com/news/security/fake-movie-file-infects-pc-to-steal-cryptocurrency-poison-google-results/

Thanks,

John Bennett
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Wikidata] [Wikipedia-l] Fwd: [Wikimedia-l] Wikipedia in an abstract language

2019-01-18 Thread John Erling Blad
I tried a couple of times to rewrite this, but it grows out of bounds
anyhow. It seems like it has a life of its own.

There is a book from 2000 by Robert Dale and Ehud Reiter: Building
Natural Language Generation Systems (ISBN 978-0-521-02451-8).

Wikibase items can be rebuilt as Plans from the type statement
(top-down) or as Constituents from the other statements (bottom-up).
The two models do not necessarily agree. This is, though, only the
overall document structure and organizing of the data, and it leaves
out the really hard part – the language-specific realization.

You can probably redefine Plans and Constituents as entities (I have
toyed around with them as Lua classes) and put them into Wikidata. The
easiest way to reuse them locally would be to use a lookup structure
for fully or partly canned text, and define rules for agreement and
inflection as part of these texts. Piecing together canned text is
hard, but easier than building full prose from the bottom. It is
possible to define a very low-level realization for some languages,
but that is a lot harder.

The idea for lookup of canned text is to use the text that covers most
of the available statements, but still such that most of the remaining
statements can also be covered. That is, some kind of canned text might
not support a specific agreement rule; thus some other canned text cannot
reference it and less coverage is achieved. For example, if the
direction to the sea cannot be expressed in a canned text for Finnish,
then the distance cannot reference the direction.

To get around this I prioritized Plans and Constituents, with those
having higher priority being put first. What a person is known for
should go in front of his other work. I ordered the Plans and
Constituents chronologically to maintain causality. This can also be
called sorting. Priority tends to influence plans, and order influences
constituents. Then there is grouping, which keeps some statements
together. Length, width, and height are typically a group.

A lake can be described with individual canned texts for length, width,
and height, but those are given low priority. Then a canned text can be
made for length and height, with somewhat higher priority. An
even higher priority can be given to a canned text for all three.
If all three statements are available, the composite
canned text for all of them will be used. If only some of them exist,
a lower-priority canned text will be used.
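
A sketch of that selection rule, assuming the statements have already been
flattened into a simple property/value table, and with made-up canned texts,
property names, and priorities:

    -- Pick the highest-priority canned text whose required statements all exist.
    local cannedTexts = {
        { priority = 3, needs = { 'length', 'width', 'depth' },
          text = 'The lake is $length long, $width wide and $depth deep.' },
        { priority = 2, needs = { 'length', 'depth' },
          text = 'The lake is $length long and $depth deep.' },
        { priority = 1, needs = { 'length' },
          text = 'The lake is $length long.' },
    }

    local function pickCannedText( statements )
        local best
        for _, candidate in ipairs( cannedTexts ) do
            local covered = true
            for _, prop in ipairs( candidate.needs ) do
                if statements[prop] == nil then
                    covered = false
                    break
                end
            end
            if covered and ( best == nil or candidate.priority > best.priority ) then
                best = candidate
            end
        end
        return best and ( best.text:gsub( '%$(%w+)', statements ) ) or nil
    end

    -- Only length and depth are known, so the priority-2 text is chosen.
    print( pickCannedText( { length = '24 km', depth = '460 m' } ) )
    --> The lake is 24 km long and 460 m deep.

The real realization step is where agreement and inflection rules would kick
in; this only shows the coverage and priority part.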

Note that the book uses "canned text" a little differently.

Also note that the canned texts can be translated as ordinary message
strings. They can also be defined as a kind of entity in Wikidata.
As ordinary message strings they need additional data, but that comes
naturally as entities in Wikidata. My doodling put them inside each
Wikipedia, as that would make them easier to reuse from Lua modules. (And yes,
you can then override part of the ArticlePlaceholder to show the text
at the special page.)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Wikidata] [Wikipedia-l] Fwd: [Wikimedia-l] Wikipedia in an abstract language

2019-01-14 Thread John Erling Blad
An additional note: what Wikipedia urgently needs is a way to create
and reuse canned text (aka "templates"), and a way to adapt that text
to data from Wikidata. That is mostly just inflection rules, but in
some cases it involves grammar rules. To create larger pieces of text
is much harder, especially if the text is supposed to be readable.
Jumbling sentences together as is commonly done by various botscripts
does not work very well, or rather, it does not work at all.
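
As a sketch of what "mostly just inflection rules" means in practice, with a
made-up message format and rule table rather than any existing API, adapting a
single piece of canned text to a number taken from Wikidata could be as small
as this:

    -- Pick the right inflected form for a value, here grammatical number.
    local plural = {
        nb = function ( n ) return n == 1 and 'innbygger' or 'innbyggere' end,
        en = function ( n ) return n == 1 and 'inhabitant' or 'inhabitants' end,
    }

    local function describePopulation( lang, population )
        return string.format( '%d %s', population, plural[lang]( population ) )
    end

    print( describePopulation( 'nb', 1 ) )     --> 1 innbygger
    print( describePopulation( 'nb', 5210 ) )  --> 5210 innbyggere

Real canned text would also need case, gender, and agreement rules per
language, which is where the grammar rules come in.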

On Mon, Jan 14, 2019 at 11:44 AM John Erling Blad  wrote:
>
> Using an abstract language as a basis for translations has been
> tried before, and it is almost as hard as translating between two common
> languages.
>
> There are two really hard problems: the implied references and
> the cultural context. An artificial language can get rid of the
> implied references, but it tends to create very weird and unnatural
> expressions. If the cultural context is removed, then it can be
> extremely hard to put it back in, and without any cultural context it
> can be hard to explain anything.
>
> But yes, you can make an abstract language, but it won't give you any
> high quality prose.
>
> On Mon, Jan 14, 2019 at 8:09 AM Felipe Schenone  wrote:
> >
> > This is quite an awesome idea. But thinking about it, wouldn't it be 
> > possible to use structured data in wikidata to generate articles? Can't we 
> > skip the need of learning an abstract language by using wikidata?
> >
> > Also, is there discussion about this idea anywhere in the Wikimedia wikis? 
> > I haven't found any...
> >
> > On Sat, Sep 29, 2018 at 3:44 PM Pine W  wrote:
> >>
> >> Forwarding because this (ambitious!) proposal may be of interest to people
> >> on other lists. I'm not endorsing the proposal at this time, but I'm
> >> curious about it.
> >>
> >> Pine
> >> ( https://meta.wikimedia.org/wiki/User:Pine )
> >>
> >>
> >> -- Forwarded message -
> >> From: Denny Vrandečić 
> >> Date: Sat, Sep 29, 2018 at 6:32 PM
> >> Subject: [Wikimedia-l] Wikipedia in an abstract language
> >> To: Wikimedia Mailing List 
> >>
> >>
> >> Semantic Web languages allow to express ontologies and knowledge bases in a
> >> way meant to be particularly amenable to the Web. Ontologies formalize the
> >> shared understanding of a domain. But the most expressive and widespread
> >> languages that we know of are human natural languages, and the largest
> >> knowledge base we have is the wealth of text written in human languages.
> >>
> >> We look for a path to bridge the gap between knowledge representation
> >> languages such as OWL and human natural languages such as English. We
> >> propose a project to simultaneously expose that gap, allow to collaborate
> >> on closing it, make progress widely visible, and is highly attractive and
> >> valuable in its own right: a Wikipedia written in an abstract language to
> >> be rendered into any natural language on request. This would make current
> >> Wikipedia editors about 100x more productive, and increase the content of
> >> Wikipedia by 10x. For billions of users this will unlock knowledge they
> >> currently do not have access to.
> >>
> >> My first talk on this topic will be on October 10, 2018, 16:45-17:00, at
> >> the Asilomar in Monterey, CA during the Blue Sky track of ISWC. My second,
> >> longer talk on the topic will be at the DL workshop in Tempe, AZ, October
> >> 27-29. Comments are very welcome as I prepare the slides and the talk.
> >>
> >> Link to the paper: http://simia.net/download/abstractwikipedia.pdf
> >>
> >> Cheers,
> >> Denny
> >> ___
> >> Wikimedia-l mailing list, guidelines at:
> >> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
> >> https://meta.wikimedia.org/wiki/Wikimedia-l
> >> New messages to: wikimedi...@lists.wikimedia.org
> >> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> >> <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>
> >> ___
> >> Wikipedia-l mailing list
> >> wikipedi...@lists.wikimedia.org
> >> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
> >
> > ___
> > Wikidata mailing list
> > wikid...@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Wikidata] [Wikipedia-l] Fwd: [Wikimedia-l] Wikipedia in an abstract language

2019-01-14 Thread John Erling Blad
Using an abstract language as a basis for translations has been
tried before, and it is almost as hard as translating between two common
languages.

There are two really hard problems: the implied references and
the cultural context. An artificial language can get rid of the
implied references, but it tends to create very weird and unnatural
expressions. If the cultural context is removed, then it can be
extremely hard to put it back in, and without any cultural context it
can be hard to explain anything.

But yes, you can make an abstract language, but it won't give you any
high quality prose.

On Mon, Jan 14, 2019 at 8:09 AM Felipe Schenone  wrote:
>
> This is quite an awesome idea. But thinking about it, wouldn't it be possible 
> to use structured data in wikidata to generate articles? Can't we skip the 
> need of learning an abstract language by using wikidata?
>
> Also, is there discussion about this idea anywhere in the Wikimedia wikis? I 
> haven't found any...
>
> On Sat, Sep 29, 2018 at 3:44 PM Pine W  wrote:
>>
>> Forwarding because this (ambitious!) proposal may be of interest to people
>> on other lists. I'm not endorsing the proposal at this time, but I'm
>> curious about it.
>>
>> Pine
>> ( https://meta.wikimedia.org/wiki/User:Pine )
>>
>>
>> -- Forwarded message -
>> From: Denny Vrandečić 
>> Date: Sat, Sep 29, 2018 at 6:32 PM
>> Subject: [Wikimedia-l] Wikipedia in an abstract language
>> To: Wikimedia Mailing List 
>>
>>
>> Semantic Web languages allow to express ontologies and knowledge bases in a
>> way meant to be particularly amenable to the Web. Ontologies formalize the
>> shared understanding of a domain. But the most expressive and widespread
>> languages that we know of are human natural languages, and the largest
>> knowledge base we have is the wealth of text written in human languages.
>>
>> We look for a path to bridge the gap between knowledge representation
>> languages such as OWL and human natural languages such as English. We
>> propose a project to simultaneously expose that gap, allow to collaborate
>> on closing it, make progress widely visible, and is highly attractive and
>> valuable in its own right: a Wikipedia written in an abstract language to
>> be rendered into any natural language on request. This would make current
>> Wikipedia editors about 100x more productive, and increase the content of
>> Wikipedia by 10x. For billions of users this will unlock knowledge they
>> currently do not have access to.
>>
>> My first talk on this topic will be on October 10, 2018, 16:45-17:00, at
>> the Asilomar in Monterey, CA during the Blue Sky track of ISWC. My second,
>> longer talk on the topic will be at the DL workshop in Tempe, AZ, October
>> 27-29. Comments are very welcome as I prepare the slides and the talk.
>>
>> Link to the paper: http://simia.net/download/abstractwikipedia.pdf
>>
>> Cheers,
>> Denny
>> ___
>> Wikimedia-l mailing list, guidelines at:
>> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
>> https://meta.wikimedia.org/wiki/Wikimedia-l
>> New messages to: wikimedi...@lists.wikimedia.org
>> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
>> 
>> ___
>> Wikipedia-l mailing list
>> wikipedi...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
>
> ___
> Wikidata mailing list
> wikid...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Help: setting a property on page save?

2018-12-04 Thread John
What is your end goal?

On Tue, Dec 4, 2018 at 6:52 PM FreedomFighterSparrow <
freedomfighterspar...@gmail.com> wrote:

> I'm trying to solve the following situation: I need to know, for each
> article,
> the last time it was updated by an actual human. MediaWiki keeps track of
> the
> last update to the page, but doesn't take into account whether it was
> performed
> by a bot or a human.
>
> Instead of querying the revision table every time, I thought of saving a
> page
> property and updating it on page save:
> - Is the editor a human*?
>   - Yes: update the property.
>   - No: Do we already have a last real update date saved?
> - Yes: do nothing (keep the last date)
> - No: find the latest revision by a human and save the property
>
>
> The most logical hook seemed to be 'PageContentSaveComplete' or maybe
> 'PageContentInsertComplete', as I only want to do this if the save actually
> went through (if it is failed by something, we shouldn't update).
> My problem is, I don't seem to have a way to set a property from there...
>
> I really don't want to have to create my own table just for this.
> Any ideas how to solve this? Maybe I'm going about it in a cockamamie way?
>
> Thanks,
> - Dror
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] non-obvious uses of in your language

2018-10-05 Thread John Erling Blad
In my opinion we should first try to process the whole linked phrase by
inflection aka affix rules, and if that fails, aka no link target can be
found – then and only then should the regexps for prefixes and linktrails be
applied. If applying a prefix or linktrail creates a word that can be
inflected, and it links to the same target, then move the strings into the
linked phrase. If the link uses the pipe form, then move the strings into
the second part of the link, aka the link text.
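
A sketch of that order of operations, with a made-up affix list, a stand-in
page-existence check, and no attempt at real weighting:

    -- Try inflection/affix rules first; fall back to the classic linktrail.
    local function resolveLink( target, trail, pageExists )
        local affixes = { 'r', 'ne', 'er', 'ene' }   -- toy Norwegian-like endings
        for _, affix in ipairs( affixes ) do
            if trail:sub( 1, #affix ) == affix and pageExists( target .. affix ) then
                -- The trailing characters become part of the (inflected) target.
                return target .. affix, trail:sub( #affix + 1 )
            end
        end
        -- Linktrail fallback: trailing letters only extend the visible link
        -- text, the target itself stays untouched.
        local trailing = trail:match( '^%a*' )
        return target, trail:sub( #trailing + 1 )
    end

    -- 'Spektrallinje' plus the trail 'r i spekteret' resolves to the existing
    -- page 'Spektrallinjer', leaving ' i spekteret' as plain text.
    local pages = { ['Spektrallinje'] = true, ['Spektrallinjer'] = true }
    print( resolveLink( 'Spektrallinje', 'r i spekteret',
        function ( t ) return pages[t] end ) )

The hard part is, as noted below, picking the right target when several rules
match; this sketch just takes the first affix that resolves.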

Links using the pipe-form should not have the link target inflected. This
is important, as this is the natural escape route if inflection gives the
wrong target for whatever reason.

Inflected links should go to the target with the smallest difference. This
is a non-trivial problem. We often link _phrases_ and those could be
processed by several rules, each with some kind of weighting. An edit
distance would probably not be sufficient.

Perhaps most important: VisualEditor should not insert , and if the
users need this escape route then let them do it themselves in
WikitextEditor.


On Fri, Oct 5, 2018 at 6:17 PM Amir E. Aharoni 
wrote:

> ‫בתאריך יום ו׳, 5 באוק׳ 2018 ב-16:59 מאת ‪Dan Garry‬‏ <‪
> dga...@wikimedia.org
> ‬‏>:‬
> >
> > On Thu, 4 Oct 2018 at 23:29, John Erling Blad  wrote:
> >
> > > Usually it comes from user errors while using VE. This kind of errors
> are
> > > quite common, and I asked (several years ago) whether it could be fixed
> in
> > > VE, but was told "no".
> > >
> >
> > I'd really appreciate it if you could give me more information on this.
>
> This is very frequent. I know that in the Hebrew Wikipedia it happens up to
> 20 times a day (I actually counted this for many months), and this is never
> intentional or desirable. Never, ever. 100% of cases. The same must be true
> for many other languages, but probably not for all. In wikis bigger than
> the Hebrew Wikipedia it probably happens much more often than 20 times a
> day.
>
> It is possibly the most frequent reason for automatic insertion of 
> tags (although this may be different by language).
>
> How does it happen? Several ways:
> * People add a word ending to an existing link. English has very few word
> endings (-s, -ing, -ed, -able, and not much more), but many other languages
> have more.
> * People highlight only a part of a word when they add a link, even though
> they should have highlighted the whole word.
> * In particular, people highlight the part of the word without an ending.
> For example, "Dogs" is written, and people highlight "Dog".
> * People sometimes actually want to write two separate words and forget to
> write a space. (This may sound silly, but I saw this happening very often.)
> * People write a compound word and link a part of the word. Sometimes it's
> intentional, although as we can see in other emails in this thread not
> everybody agrees about the desirability of this. This works very
> differently in different languages. German has a lot of them, English has
> much less, Hebrew has almost zero.
>
> It's worth running proper user testing
>
> > Here's how the linking feature works right now for adding links to words
> > which presently have no links:
> >
> >- If you put your cursor inside a word without highlighting anything,
> >and add a link, the link is added to the entire word.
> >- If you highlight some text, and add a link, the link is added to the
> >highlighted text.
>
> I know this, and I like how it works, but the fact is that there are many
> other users who don't know this. Simply searching wikitext for
> "]]" will show how often does this happen.
>
> > How would you propose this feature be changed?
>
> One possibility is to not add  after a link. I proposed it, but it
> was declined: https://phabricator.wikimedia.org/T141689 . The declining
> comment links to T128060, which you mentioned in your email, and it's still
> not resolved.
>
> Other than fully stopping to do it, I cannot think of many other
> possibilities. Maybe we could show a warning, although I suspect that many
> users will ignore it or find it unnecessarily intrusive. I'm not a real
> designer, and it's possible that a real designer can come with something
> better.
>
> Another thing we could consider is to link the whole word *by default*, and
> to add another function that separates a link from the trail. I'd further
> suggest the separation be done internally not by "", but by some
> other syntax that looks more semantic, for example "{{#sep}}" (this should
> be a magic word and not a template!). My educated guess is that separating
> the word 

Re: [Wikitech-l] non-obvious uses of in your language

2018-10-05 Thread John Erling Blad
T129778

On Fri, Oct 5, 2018 at 3:59 PM Dan Garry  wrote:

> On Thu, 4 Oct 2018 at 23:29, John Erling Blad  wrote:
>
> > Usually it comes from user errors while using VE. This kind of errors are
> > quite common, and I asked (several years ago) whether it could be fixed
> in
> > VE, but was told "no".
> >
>
> I'd really appreciate it if you could give me more information on this.
> Could you link to the task for this request? There is T128060
> <https://phabricator.wikimedia.org/T128060> from early 2016 ("VisualEditor
> makes it easy to create partially linked words, when the user expects a
> fully linked one") but I don't see you on there, and I want to make sure I
> understand your request.
>
> Here's how the linking feature works right now for adding links to words
> which presently have no links:
>
>- If you put your cursor inside a word without highlighting anything,
>and add a link, the link is added to the entire word.
>- If you highlight some text, and add a link, the link is added to the
>highlighted text.
>
> How would you propose this feature be changed?
>
> Thanks,
> Dan
>
> --
> Dan Garry
> Lead Product Manager, Editing
> Wikimedia Foundation
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] non-obvious uses of <nowiki/> in your language

2018-10-04 Thread John Erling Blad
Only thing more dangerous than running a bot on nowiki is running a bot on
dewiki.
Nope, never touches dewiki.

On Fri, Oct 5, 2018 at 12:49 AM Roul P.  wrote:

> Interesting, today this was a topic in the German main forum:
>
> https://de.wikipedia.org/wiki/Wikipedia:Fragen_zur_Wikipedia#Anwendung_von__in_Bildunterschriften
>
> Today there is also more than one user indefinitely blocked who only
> removed <nowiki/>: https://de.wikipedia.org/wiki/Benutzer:Entgr%C3%A4ten40
>
> Am Fr., 5. Okt. 2018 um 00:29 Uhr schrieb John Erling Blad <
> jeb...@gmail.com
> >:
>
> > We have the same in Norwegian, but linking on part of a composite is
> almost
> > always wrong. Either you link on the whole composite or no part of the
> > composite. If you link on a part of a composite, then in nearly all
> cases I
> > have seen the link is placed on the wrong term.
> >
> > Some examples of the insanity users write:
> > - [[absorpsjon]]s[[Spektrallinje|linjene]]
> > - [[Autentisering]]s[[Protokoll (datamaskiner)|protokollen]]
> > - [[Sykepleie|sykehjem]]s[[Hjemmesykepleie|omsorg]]
> >
> > From an article messed up by VE (yes it does mess up articles sometimes!)
> > - ma[[Øssur Havgrímsson|ge]]e[[Øssur Havgrímsson|evner og]]
> > - og[[Øssur Havgrímsson|i]]t[[Øssur Havgrímsson|det samme]]
> >
> > I have no clue what the previous means…
> >
> > Things like the following are quite common:
> > - [[Alexander Kielland]]s
> > - [[De forente nasjoner|FN]]s
> >
> > Usually it comes from user errors while using VE. This kind of error is
> > quite common, and I asked (several years ago) whether it could be fixed
> in
> > VE, but was told "no".
> >
> > Anyhow I just started a bot to clean up some of the mess…
> >
> >
> > On Thu, Oct 4, 2018 at 6:59 PM Thiemo Kreuz 
> > wrote:
> >
> > > Hey!
> > >
> > > The syntax "[[Schnee]]reichtum" is quite common in the
> > > German community. There are not many other ways to achieve the same:
> > >  or  can be used instead.[1] The latter is often the
> > > better alternative, but an auto-replacement is not possible. For
> > > example, "[[Bund]]estag" must become "[[Bund]]estag".
> > >
> > > Not long ago  was often used. This became a problem with the
> > > recent parser updates. All  got replaced with , as far
> > > as I'm aware of.
> > >
> > > > in German, shouldn't they be tweaking the "linktrail" setting on
> > dewiki,
> > > instead of using `<nowiki/>`? What are cases where they *do* want the
> > link
> > > to include the entire word?
> > >
> > > The software feature exists because of English [[word ending]]s. The
> > > same exists in German ("viele [[Wiki]]s, viele [[Tisch]]e, viele
> > > [[Arbeit]]en"), but is overshadowed by the fact that German is a
> > > language with many composites. From my experience, the fact that all
> > > linktrails, no matter how long, become part of the link is almost
> > > always a problem. It enlarges the click region, which is good, but
> > > surprises the reader when he ends at an unexpected article. I guess it
> > > would actually be a net-gain when the feature gets turned off or tuned
> > > down in German wikis. For example, we could limit the length of the
> > > linktrail to 2 characters.
> > >
> > > Is somebody interested in creating usage statistics for these
> > > linktrails in the German Wikipedia main namespace?
> > >
> > > Best
> > > Thiemo
> > >
> > > [1]
> > >
> >
> https://de.wikipedia.org/wiki/Wikipedia:Verlinken#Verlinkung_von_Teilw%C3%B6rtern
> > >
> > > ___
> > > Wikitech-l mailing list
> > > Wikitech-l@lists.wikimedia.org
> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] non-obvious uses of <nowiki/> in your language

2018-10-04 Thread John Erling Blad
We have the same in Norwegian, but linking on part of a composite is almost
always wrong. Either you link on the whole composite or no part of the
composite. If you link on a part of a composite, then in nearly all cases I
have seen the link is placed on the wrong term.

Some examples of the insanity users write:
- [[absorpsjon]]s[[Spektrallinje|linjene]]
- [[Autentisering]]s[[Protokoll (datamaskiner)|protokollen]]
- [[Sykepleie|sykehjem]]s[[Hjemmesykepleie|omsorg]]

From an article messed up by VE (yes it does mess up articles sometimes!)
- ma[[Øssur Havgrímsson|ge]]e[[Øssur Havgrímsson|evner og]]
- og[[Øssur Havgrímsson|i]]t[[Øssur Havgrímsson|det samme]]

I have no clue what the previous means…

Things like the following are quite common:
- [[Alexander Kielland]]s
- [[De forente nasjoner|FN]]s

Usually it comes from user errors while using VE. This kind of error is
quite common, and I asked (several years ago) whether it could be fixed in
VE, but was told "no".

Anyhow I just started a bot to clean up some of the mess…
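
For illustration, a rough sketch of the kind of pattern such a cleanup bot
might hunt for; the regex and the review step are assumptions, not a
description of the actual bot:

    import re

    # A link, then a short unlinked fragment, then another link:
    # e.g. "[[absorpsjon]]s[[Spektrallinje|linjene]]"
    SPLIT_COMPOSITE = re.compile(
        r"\[\[[^\[\]|]+(?:\|[^\[\]]*)?\]\][a-zæøåäöüß]{1,3}\[\[[^\[\]]+\]\]"
    )

    def find_suspects(wikitext: str):
        """Return the suspicious spans so they can be reviewed (or fixed) one by one."""
        return [m.group(0) for m in SPLIT_COMPOSITE.finditer(wikitext)]

    print(find_suspects("Noe om [[absorpsjon]]s[[Spektrallinje|linjene]] i spekteret."))
    # -> ['[[absorpsjon]]s[[Spektrallinje|linjene]]']
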


On Thu, Oct 4, 2018 at 6:59 PM Thiemo Kreuz 
wrote:

> Hey!
>
> The syntax "[[Schnee]]reichtum" is quite common in the
> German community. There are not many other ways to achieve the same:
>  or  can be used instead.[1] The latter is often the
> better alternative, but an auto-replacement is not possible. For
> example, "[[Bund]]estag" must become "[[Bund]]estag".
>
> Not long ago  was often used. This became a problem with the
> recent parser updates. All  got replaced with , as far
> as I'm aware of.
>
> > in German, shouldn't they be tweaking the "linktrail" setting on dewiki,
> instead of using `<nowiki/>`? What are cases where they *do* want the link
> to include the entire word?
>
> The software feature exists because of English [[word ending]]s. The
> same exists in German ("viele [[Wiki]]s, viele [[Tisch]]e, viele
> [[Arbeit]]en"), but is overshadowed by the fact that German is a
> language with many composites. From my experience, the fact that all
> linktrails, no matter how long, become part of the link is almost
> always a problem. It enlarges the click region, which is good, but
> surprises the reader when he ends at an unexpected article. I guess it
> would actually be a net-gain when the feature gets turned off or tuned
> down in German wikis. For example, we could limit the length of the
> linktrail to 2 characters.
>
> Is somebody interested in creating usage statistics for these
> linktrails in the German Wikipedia main namespace?
>
> Best
> Thiemo
>
> [1]
> https://de.wikipedia.org/wiki/Wikipedia:Verlinken#Verlinkung_von_Teilw%C3%B6rtern
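
A minimal sketch of the statistics asked about above, assuming a local dewiki
pages-articles dump; the file name and the linktrail character class are
assumptions, and namespace filtering is left out for brevity:

    import bz2
    import re
    from collections import Counter

    DUMP = "dewiki-latest-pages-articles.xml.bz2"
    TRAIL = re.compile(r"\]\]([a-zäöüß]+)")   # rough approximation of the dewiki linktrail characters

    lengths = Counter()
    with bz2.open(DUMP, "rt", encoding="utf-8", errors="replace") as dump:
        for line in dump:
            for match in TRAIL.finditer(line):
                lengths[len(match.group(1))] += 1

    for length, n in sorted(lengths.items()):
        print(f"{length:2d}-character linktrails: {n}")
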
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] problematic use of "Declined" in Phabricator

2018-10-02 Thread John Erling Blad
*very much agree with both Amir and Brion*

I've seen the same thing: something is reported as a more or less general
issue, it is then picked up as a task, it is further discussed in a
specific context, and then it is closed because it does not fit that
context. But the new context wasn't part of the reported issue, which
concerns the production system; it is part of some specific ongoing development.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Translations on hold until further notice

2018-09-27 Thread John Bennett
Hello,

This update is to add some additional information we are now able to share
in regard to why translation updates are on hold.

The Security team and others at the Wikimedia Foundation are engaged in a
security event involving our translation services.  No Wikimedia users or
their data are currently affected. We made the decision to temporarily
disable translation updates until suitable countermeasures can be applied
and at this point reinstatement of translation updates is to be determined.
At the resolution of this event the Security team will publish a summary
blog post (https://phabricator.wikimedia.org/phame/blog/view/13/) including
additional details as appropriate.

Thank you for your patience and understanding while we work to better
protect the community.

John Bennett

Director of Security, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Wikidata] Semantic annotation of red links on Wikipedia

2018-09-25 Thread John Erling Blad
This can be done with the special page "AboutTopic" with some additional
logic. It has been discussed at a few projects, but the necessary logic
isn't available. That means the redlink must be created with the q-id, and
there is no well-defined process on how to clean it up afterwards.

At nnwiki the article about Erich Mühsam (
https://nn.wikipedia.org/wiki/Erich_M%C3%BChsam) has a link in the infobox
to Oranienburg (https://nn.wikipedia.org/wiki/Spesial:AboutTopic/Q14808)
which is an article placeholder.

The necessary logic to avoid use of the q-id would be to use some kind of
"item disambiguation"-page instead of going straight to the "about
topic"-page. Still note that in a lot of cases the item can be
automatically disambiguated by simple cluster analysis.
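
For illustration, a small sketch of the lookup such a placeholder or red-link
helper needs: given an item, check whether a sitelink already exists on a
particular wiki. The wbgetentities call is the standard Wikidata API; the rest
is an assumption, not part of any existing gadget:

    import requests

    WIKIDATA_API = "https://www.wikidata.org/w/api.php"

    def sitelink(qid: str, wiki: str):
        """Return the local title for `qid` on `wiki` (e.g. 'nnwiki'), or None."""
        params = {
            "action": "wbgetentities",
            "ids": qid,
            "props": "sitelinks",
            "sitefilter": wiki,
            "format": "json",
        }
        entity = requests.get(WIKIDATA_API, params=params).json()["entities"][qid]
        return entity.get("sitelinks", {}).get(wiki, {}).get("title")

    # Expected to be None as long as nnwiki only has the article placeholder for Q14808.
    print(sitelink("Q14808", "nnwiki"))
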

On Tue, Sep 25, 2018 at 3:40 PM 80hnhtv4agou--- via Wikitech-l <
wikitech-l@lists.wikimedia.org> wrote:

>
> All that red makes the page look bad, and I would like to point out the
> abuse factor here: all those red links start edit wars,
>
> and should be put there, if at all, by people.
>
> The creation of the Wikidata page also creates a problem, because it does
> not establish a label, which should be mandatory
>
> and in English,
> in the save process.
>
> And this problem:
> https://www.wikidata.org/wiki/Wikidata:WikiProject_Labels_and_descriptions#List_of_items_without_labels_and/or_descriptions
>
>
>
> >Tuesday, September 25, 2018 2:58 AM -05:00 from Sergey Leschina <
> m...@putnik.ws>:
> >
> >I want to draw your attention to the problem from the other side. On the
> newly created page, which can be opened via the red link, there is no
> binding to Wikidata. This means that after creation, the page will
> not automatically be linked to Wikidata. And if the project has
> templates that can use information from Wikidata, they will not fully
> work until the page has been saved at least once and linked to an item. I
> already suggested adding a parameter for this:
> https://phabricator.wikimedia.org/T178249
> >
> >If something like this is implemented, then it will be possible to
> make a template for red links (with Lua and TemplateStyles) that is
> connected to Wikidata. Although I agree that it would be better to have
> syntax that allows making such links without these difficulties.
> >пн, 24 сент. 2018 г. в 20:50, Maarten Dammers < maar...@mdammers.nl >:
> >>Hi everyone,
> >>
> >>According to  https://www.youtube.com/watch?v=TLuM4E6IE5U : "Semantic
> >>annotation is the process of attaching additional information to various
> >>concepts (e.g. people, things, places, organizations etc) in a given
> >>text or any other content. Unlike classic text annotations for reader's
> >>reference, semantic annotations are used by machines to refer to."
> >>(more at
> >>https://ontotext.com/knowledgehub/fundamentals/semantic-annotation/ )
> >>
> >>On Wikipedia a red link is a link to an article that hasn't been created
> >>(yet) in that language. Often another language does have an article
> >>about the subject or at least we have a Wikidata item about the subject.
> >>Take for example
> >>https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris . It has over
> >>250 incoming links, but the person doesn't have an article in Dutch. We
> >>have a Wikidata item with links to 7 Wikipedia's at
> >>https://www.wikidata.org/wiki/Q116510 , but no way to relate
> >>https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris with
> >>https://www.wikidata.org/wiki/Q116510 .
> >>
> >>Wouldn't it be nice to be able to make a connection between the red link
> >>on Wikipedia and the Wikidata item?
> >>
> >>Let's assume we have this list somewhere. We would be able to offer all
> >>sorts of nice features to our users like:
> >>* Hover over the link to get a hovercard in your favorite backup language
> >>* Generate an article placeholder for the user with basic information in
> >>the local language
> >>* Pre-populate the translate extension so you can translate the article
> >>from another language
> >>(probably plenty of other good uses)
> >>
> >>Where to store this link? I'm not sure about that. On some Wikipedias
> >>people have tested with local templates around the red links. That's not
> >>structured data, clutters up the Wikitext, it doesn't scale and the
> >>local communities generally don't seem to like the approach. That's not
> >>the way to go. Maybe a better option would be to create a new property
> >>on Wikidata to store the name of the future article. Something like
> >>Q116510: Pxxx -> (nl)"Friedrich Ris". Would be easiest because the
> >>infrastructure is there and you can just build tools on top of it, but
> >>I'm afraid this will cause a lot of noise on items. A couple of
> >>suggestions wouldn't be a problem, but what is keeping people from
> >>adding the suggestion in 100 languages? Or maybe restrict the usage that
> >>a Wikipedia must have at least 1 (or n) incoming links before people are
> >>allowed to add it?
> >>We could 

Re: [Wikitech-l] My Phabricator account has been disabled

2018-08-08 Thread John
Alex, honestly, as a passive observer I have seen CoC issues used as a
sledgehammer to force ideas through and to shut down open, civil discussions
and disagreements.

On Wed, Aug 8, 2018 at 9:29 AM Alex Monk  wrote:

> Are you trying to ban people discussing CoC committee decisions publicly?
> Not that it even looks like he wrote grievances.
>
> On Wed, 8 Aug 2018, 14:23 Dan Garry,  wrote:
>
> > On 8 August 2018 at 13:53, MZMcBride  wrote:
> > >
> > > Ah, I found the e-mail: […]
> > >
> >
> > This mailing list is not an appropriate forum for airing your grievances
> > with the way the Code of Conduct Committee has handled this matter.
> >
> > Dan
> >
> > --
> > Dan Garry
> > Lead Product Manager, Editing
> > Wikimedia Foundation
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] My Phabricator account has been disabled

2018-08-08 Thread John
Why shouldn’t users be able to A) find out why their account was disabled
(the original email was lost in the clutter)? And B) something as simple as
WTF isn’t a reasonable bannable offense. It wasn’t calling someone an F. If
the CoC Committee is afraid of having their actions brought to light in a
public discussion, odds are their actions are not acceptable.

On Wed, Aug 8, 2018 at 9:23 AM Dan Garry  wrote:

> On 8 August 2018 at 13:53, MZMcBride  wrote:
> >
> > Ah, I found the e-mail: […]
> >
>
> This mailing list is not an appropriate forum for airing your grievances
> with the way the Code of Conduct Committee has handled this matter.
>
> Dan
>
> --
> Dan Garry
> Lead Product Manager, Editing
> Wikimedia Foundation
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Can/should my extensions be deleted from the Wikimedia Git repository?

2018-06-08 Thread John
> Where? So far it's been a few individuals.


Hear, hear. Can you please cite the clear community decision you are
referencing? Just because a few users took unilateral actions and most
people didn't object, that isn't consensus.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Can/should my extensions be deleted from the Wikimedia Git repository?

2018-06-07 Thread John
*It's a reasonable ask to have the file there* Correct, it's reasonable to
ask. Forcing it down people's throats and cluttering 830+ repos with the
same file is not. Why not have it in the primary mediawiki directory and
note that it covers all sub-projects? Threatening users and telling users
who disagree with your position about a file requirement not in the CoC
is flat-out intimidation. Instead of saying *maybe this should be brought
up for discussion*, users are now defending and threatening users who
questioned them. It just leaves a sour taste in my mouth. Feel free to
continue to personally attack those you disagree with, instead of the
subject matter. Whatever.


On Thu, Jun 7, 2018 at 8:48 PM, Ryan Lane  wrote:

> The most likely way for people to see codes of conduct is through
> repositories, which lets them know they have some way to combat harassment
> in the tool they're using to try to contribute to a particular repository.
> It makes sense to have a CODE_OF_CONDUCT.md in the repos; however, if all
> the repos are using the same policy, it's often better to have a minimal
> CODE_OF_CONDUCT.md that simply says "This repo is governed by the blah blah
> code of conduct, specified here: ". This makes it possible to have a
> single boilerplate code of conduct without needing to update every repo
> whenever the CoC changes.
>
> It's a reasonable ask to have the file there, and this discussion feels
> like a thinly veiled argument against CoCs as a whole. If you're so against
> the md file, or against the CoC as a whole, github and/or gitlab are fine
> places to host a repository.
>
> On Thu, Jun 7, 2018 at 5:39 PM, John  wrote:
>
> > Honestly I find forcing documentation into repos to be abrasive, and
> > overstepping the bounds of the CoC. I also find the behavior of those
> > pushing such an approach to be hostile and overly aggressive. Why do you
> > need to force a copy of the CoC into every repo? Why not keep it in a
> > central location? What kind of mess would you need to clean up if for some
> > reason you needed to adjust the contents of that file? Instead of having
> > one location to update, you now have 800+ copies that need to be fixed.
> >
> > On Thu, Jun 7, 2018 at 8:23 PM, Yaron Koren  wrote:
> >
> > >  Chris Koerner  wrote:
> > > > “Please just assume for the sake of this discussion that (a) I'm
> > willing
> > > > to abide by the rules of the Code of Conduct, and (b) I don't want
> the
> > > > CODE_OF_CONDUCT.md file in my extensions.”
> > > > Ok, hear me out here. What if I told you those two things are
> > > > incompatible? That abiding by the community agreements requires the
> > file
> > > > as an explicit declaration of said agreement. That is to say, if we
> had
> > > > a discussion about amending the CoC to be explicit about this
> > expectation
> > > > you wouldn’t have issues with including it? Or at least you’d be OK
> > with
> > > > it?
> > >
> > > Brian is right that adding a requirement to include this file to the
> CoC
> > > would be an odd move. But, if it did happen, I don't know - I suppose
> I'd
> > > have two choices: either include the files or remove my code. It would
> be
> > an
> > > improvement over the current situation in at least one way: we would
> know
> > > that rules are still created in an orderly, consensus-like way, as
> > opposed
> > > to now, where a small group of developers can apparently make up rules
> as
> > > they go along.
> > >
> > > -Yaron
> > > ___
> > > Wikitech-l mailing list
> > > Wikitech-l@lists.wikimedia.org
> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > >
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Can/should my extensions be deleted from the Wikimedia Git repository?

2018-06-07 Thread John
Honestly I find forcing documentation into repos to be abrasive, and
overstepping the bounds of the CoC. I also find the behavior of those
pushing such an approach to be hostile and overly aggressive. Why do you
need to force a copy of the CoC into every repo? Why not keep it in a
central location? What kind of mess would you need to clean up if for some
reason you needed to adjust the contents of that file? Instead of having
one location to update, you now have 800+ copies that need to be fixed.

On Thu, Jun 7, 2018 at 8:23 PM, Yaron Koren  wrote:

>  Chris Koerner  wrote:
> > “Please just assume for the sake of this discussion that (a) I'm willing
> > to abide by the rules of the Code of Conduct, and (b) I don't want the
> > CODE_OF_CONDUCT.md file in my extensions.”
> > Ok, hear me out here. What if I told you those two things are
> > incompatible? That abiding by the community agreements requires the file
> > as an explicit declaration of said agreement. That is to say, if we had
> > a discussion about amending the CoC to be explicit about this expectation
> > you wouldn’t have issues with including it? Or at least you’d be OK with
> > it?
>
> Brian is right that adding a requirement to include this file to the CoC
> would be an odd move. But, if it did happen, I don't know - I suppose I'd
> have two choices: either include the files or remove my code. It would be an
> improvement over the current situation in at least one way: we would know
> that rules are still created in an orderly, consensus-like way, as opposed
> to now, where a small group of developers can apparently make up rules as
> they go along.
>
> -Yaron
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Recent Account hijacking activities

2018-05-16 Thread John Bennett
On 8 May 2018, account hijacking activities were discovered on Wikiviajes -
Spanish Wikivoyage (es.wikivoyage.org <http://es.wikivoyage.org>). It was
identified by community stewards and communicated to the Trust and Safety,
Legal, and Security teams who responded to the event. At this time the event
is still under investigation and we are unable to share more about what is
being done without risking additional hijacking of accounts. However, we feel
it is important to share what details we can and inform the community of what
happened.

Similar to past security incidents, we continue to encourage everyone to take
some routine steps to maintain a secure computer and account - including
regularly changing your passwords, actively running antivirus software on
your systems, and keeping your system software up to date.

The Wikimedia Foundation's Security team and others are investigating this
incident as well as potential improvements to prevent future incidents. We
are also working with our colleagues in other departments to develop plans
for how to best share future status updates on each of these incidents.
However, we are currently focused on resolving the issues identified.

If you have any questions, please contact the Trust and Safety team
(ca{{@}}wikimedia.org <http://wikimedia.org>).

John Bennett
Director of Security, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Map internationalization launched everywhere, AND embedded maps now live on 276 Wikipedias

2018-05-09 Thread John D.
test

From: Joe Matazzoni 
Sent: Wednesday, May 9, 2018 5:32 PM
To: wikitech-l@lists.wikimedia.org 
Subject: [Wikitech-l] Map internationalization launched everywhere, AND embedded 
maps now live on 276 Wikipedias

As of today, interactive (Kartographer) maps no longer display in the language 
of the territory mapped; instead, you’ll read them in the content language of 
the wiki where they appear—or in the language their authors specify (subject to 
availability of multilingual data). In addition, mapframe, the feature that 
automatically embeds dynamic maps right on a wiki page, is now live on most 
Wikipedias that lacked the feature. (Not included in the mapframe launch are 
nine Wikipedias [1] that use the stricter version of Flagged Revisions).

If you you’re new to mapframe, this Kartographer help page [2] shows how to get 
started putting dynamic maps on your pages.  If you’d like to read more about 
map internationalization: this Special Update [3] explains the feature and its 
limiations; this post [4] and this one [5] describe the uses of the new 
parameter, lang=”xx”, which  lets you specify a map’s language. And here are 
some example maps [6] to illustrate the new capabilities. 

These features could not have been created without the generous programming 
contributions and advice of our many map-loving volunteers, including Yurik, 
Framawiki, Naveenpf, TheDJ, Milu92, Astirlin, Evad37, Pigsonthewing, Mike Peel, 
Eran Roz,  Gareth and Abbe98. My apologies to anyone I’ve missed. 

The Map Improvements 2018 [7] project wraps up at the end of June, so please 
give internationalized maps and mapframe a try soon and give us your feedback 
on the project talk page [8]. We’re listening. 

[1] https://phabricator.wikimedia.org/T191583
[2] https://www.mediawiki.org/wiki/Help:Extension:Kartographer 
[3] 
https://www.mediawiki.org/wiki/Map_improvements_2018#April_18,_2018,_Special_Update_on_Map_Internationalization
[4] 
https://www.mediawiki.org/wiki/Map_improvements_2018#April_25,_2018,_You_can_now_try_out_internationalization_(on_testwiki)
[5] 
https://www.mediawiki.org/wiki/Map_improvements_2018#April_26,_2018:_OSM_name_data_quirks_and_the_uses_of_lang=%E2%80%9Clocal%E2%80%9D
 
[6] https://test2.wikipedia.org/wiki/Map_internationalization_examples 
[7] https://www.mediawiki.org/wiki/Map_improvements_2018
[8] https://www.mediawiki.org/wiki/Talk:Map_improvements_2018 
_

Joe Matazzoni 
Product Manager, Collaboration
Wikimedia Foundation, San Francisco

"Imagine a world in which every single human being can freely share in the sum 
of all knowledge." 




___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] OOUI v0.27.0 release (breaking change)

2018-05-09 Thread John D.
test

From: Volker E. 
Sent: Wednesday, May 9, 2018 2:58 PM
To: Wikimedia developers 
Cc: Design.Public 
Subject: [Wikitech-l] OOUI v0.27.0 release (breaking change)

Hello everyone,

We've released OOUI version 0.27.0 last evening. It will be in
MediaWiki core from 1.32.0-wmf.3, which will be deployed to Wikimedia
production in the regular train, starting on Tuesday 16 May. As there
are five breaking changes in this release, at least nominally, please
carefully consider if they affect your code.


Breaking changes since last release:

* GroupElement: Remove getItem(s)FromData (Prateek Saxena)
We rename a number of getters when asking for one of 0+ items for
consistency reasons.
This and the following two changes are all on this topic.

* MultiSelectWidget: Remove getSelectedItems and getSelectedItemsData
(Prateek Saxena)

* SelectWidget: Remove getSelectedItem (Prateek Saxena)

* TagItemWidget: Replace 'disabled' items with 'fixed' (Moriel Schottlender)
For consistency of behavior and separation of UX concerns, we don't
allow for individual tag items in a TagMultiselectWidget to be
individually disabled. Instead, allow defining 'fixed' values that
cannot be removed. We are not aware of any instance of the
TagMultiselectWidget featuring 'disabled' items, nevertheless marking
it as breaking change.

* indicators: Remove 'alert', deprecated in v0.25.2 (James D. Forrester)
'alert' indicator has been deprecated a while ago and is unused in all
known implementations, therefore we've removed it completely.


Deprecations since last release:

* icons: Deprecate 'editing-citation' icons from 'content' (Volker E.)
We've added an explicit 'editing-citation' icon pack, that helps us
standardize citation icons. Therefore we have moved existing related
icons to its own pack.

* icons: Rename 'settings' to 'pageSettings' (Volker E.)
Clarify intended usage of icon by renaming it.

Please update your icon pack references accordingly in case you're
using one of those icons.


New features, icons and highlights in this release:

* An infusable PHP NumberInputWidget was implemented by volunteer mainframe98.

* icons: 'editing-citation' pack was added (Volker E.)

* ProcessDialog: Fix footer height when actions or dialog size changes
(Bartosz Dziewoński)


Additional details on 45 code-level and accessibility changes, 34
styling and interaction design amendments, and all improvements since
v0.26.0 are in the full changelog[0]. If you have any further queries
or need help dealing with breaking changes, please let me know.

As always, library documentation is available on mediawiki.org[1], and
there is some comprehensive generated code-level documentation and
interactive demos hosted on doc.wikimedia.org[2].

Thanks to all contributors involved, especially volunteers mainframe98
and Daimona Eaytoy!

Best,
Volker

[0] - https://phabricator.wikimedia.org/diffusion/GOJU/browse/master/History.md
[1] - https://www.mediawiki.org/wiki/OOUI
[2] - https://doc.wikimedia.org/oojs-ui/master/

--
Senior User Experience Engineer
Wikimedia Foundation

volke...@wikimedia.org | @Volker_E

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] test

2018-04-25 Thread John D.
test
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] HELP info box's

2018-04-22 Thread John D.
Data from Wikidata will stretch an infobox.

how do you drop down lines not  because the text is not in the info box.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Incoming and outgoing links enquiry

2018-03-18 Thread John
I would second the recommendation of using the dumps for such a large
graphing project. If it's more than a couple hundred pages the API/database
queries can get bulky

On Sun, Mar 18, 2018 at 5:07 PM Brian Wolff  wrote:

> Hi,
>
> You can run longer queries by getting access to toolforge (
> https://wikitech.wikimedia.org/wiki/Portal:Toolforge) and running from the
> command line.
>
> However the query in question might  still take an excessively long time
> (if you are doing all of wikipedia). I would expect that query to result in
> about 150mb of data and maybe take days to complete.
>
> You can also break it down into parts by adding WHERE page_title >='a' AND
> page_title < 'b'
>
> Note, also of interest: full dumps of all the links is available at
>
> https://dumps.wikimedia.org/enwiki/20180301/enwiki-20180301-pagelinks.sql.gz
> (you would also need
> https://dumps.wikimedia.org/enwiki/20180301/enwiki-20180301-page.sql.gz to
> convert page ids to page names)
> --
> Brian
> On Sunday, March 18, 2018, Nick Bell  wrote:
> > Hi there,
> >
> > I'm a final year Mathematics student at the University of Bristol, and
> I'm
> > studying Wikipedia as a graph for my project.
> >
> > I'd like to get data regarding the number of outgoing links on each page,
> > and the number of pages with links to each page. I have already
> > inquired about this with the Analytics Team mailing list, who gave me a
> few
> > suggestions.
> >
> > One of these was to run the code at this link
> https://quarry.wmflabs.org/
> > query/25400
> > with these instructions:
> >
> > "You will have to fork it and remove the "LIMIT 10" to get it to run on
> > all the English Wikipedia articles. It may take too long or produce
> > too much data, in which case please ask on this list for someone who
> > can run it for you."
> >
> > I ran the code as instructed, but the query was killed as it took longer
> > than 30 minutes to run. I asked if anyone on the mailing list could run
> it
> > for me, but no one replied saying they could. The guy who wrote the code
> > suggested I try this mailing list to see if anyone can help.
> >
> > I'm a beginner in programming and coding etc., so any and all help you
> can
> > give me would be greatly appreciated.
> >
> > Many thanks,
> > Nick Bell
> > University of Bristol
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Persian Wikimedia cryptocurrency mining incident

2018-03-14 Thread John Bennett
On 14 March 2018, evidence of cryptocurrency mining software was discovered
on Persian Wikipedia. It was identified by the community and removed within
10 minutes of being added to the site. Additionally, the rights of the user
responsible have been revoked and their account has been globally locked. At
this time there is no evidence of any user's computer or account being
compromised or otherwise affected. However, we encourage everyone to take
some routine steps to maintain a secure computer and account - including
regularly changing your passwords, actively running antivirus software on
your systems, and keeping your system software up to date.

The Wikimedia Foundation's Security team is investigating this incident as
well as potential improvements to prevent future incidents. If you have any
questions, please contact the Security team
(security-team{{@}}wikimedia.org <http://wikimedia.org/>).

Apologies for only posting in English; translating and reposting in Fārsi
would be greatly appreciated.

Thanks,
John Bennett
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wikistats 2.0 - Now with Maps!

2018-02-26 Thread John Erling Blad
I guess this is pretty obvious, but when you create numbers for something
generated by an actor (i.e. whoever or whatever triggers the activity) within
that area, those numbers should be normalized against the number of actors.
There are a whole lot of articles being read in Norwegian from China; does
that mean Chinese people are darn good at reading Norwegian? Or does it
mean that there are a whole lot of Chinese people? I'm pretty sure the
latter is more correct than the former.

This kind of error is pretty common. At Statistics Norway (been there,
done that, etc…) they used an example about kindergartens, where statistics
had been used by a politician as evidence of a municipality with
especially good services: it had the highest number of kindergartens in the
whole country. The problem was, the number of children was also among the
highest, and the probability of a child actually getting a place in a
kindergarten was pretty low.

So please normalize the numbers! =)
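
A toy illustration of the point, with made-up numbers (both the view counts
and the population figures are assumptions, purely for the arithmetic):

    views = {"China": 120_000, "Norway": 90_000}            # hypothetical monthly views of no.wikipedia
    population_millions = {"China": 1_390, "Norway": 5.3}   # rough figures, for illustration only

    for country, raw in views.items():
        per_million = raw / population_millions[country]
        print(f"{country}: {raw} raw views, {per_million:,.0f} views per million inhabitants")

    # China "wins" on raw counts; Norway wins by a factor of roughly 200 once normalized.
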

On Wed, Feb 14, 2018 at 11:15 PM, Nuria Ruiz  wrote:

> Hello from Analytics team:
>
> Just a brief note to announce that Wikistats 2.0 includes data about
> pageviews per project per country for the current month.
>
> Take a look, pageviews for Spanish Wikipedia this current month:
> https://stats.wikimedia.org/v2/#/es.wikipedia.org/reading/
> pageviews-by-country
>
> Data is also available programatically vi APIs:
>
> https://wikitech.wikimedia.org/wiki/Analytics/AQS/
> Pageviews#Pageviews_split_by_country
>
> We will be deploying small UI tweaks during this week but please explore
> and let us know what you think.
>
> Thanks,
>
> Nuria
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unknown pages in watchlist

2018-01-29 Thread John
Check the move logs
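
For illustration, a small sketch of what that check can look like via the API
(the wiki URL and the page title are placeholders). Moved pages keep their
watchers, so a move is the usual way an unfamiliar title quietly shows up in
a watchlist:

    import requests

    API = "https://xx.wikipedia.org/w/api.php"      # assumption: the wiki in question
    SUSPECT = "Some unexpected article"             # hypothetical title from the watchlist

    params = {
        "action": "query",
        "list": "logevents",
        "letype": "move",
        "lelimit": "500",
        "format": "json",
    }
    for event in requests.get(API, params=params).json()["query"]["logevents"]:
        target = event.get("params", {}).get("target_title", "")
        if target == SUSPECT:
            print(event["timestamp"], event["user"], event["title"], "->", target)
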

On Mon, Jan 29, 2018 at 6:19 AM יגאל חיטרון  wrote:

> Hello. I have a watching issue. Writing here and not in Phabricator,
> because it's absolutely non-reproducible. So if somebody is interested -
> here are the details.
> I just found an article in my watchlist that I never added to it. I checked
> and saw more articles like this. Including the first one, 12 articles I
> never added, never edited and never opened. All of them talk about the same
> city. I was never there and never read about it on the wiki. After some more
> checks I saw that all of them are in the same category, about some small
> neighbourhood in this city. I had never heard its name. The category includes
> 13 articles. The redundant one was created a couple of weeks ago. The category
> itself is not in the watchlist. And I don't think I just don't remember,
> because I do not work on this wiki so much, so there are 61 articles in my
> watchlist there, 20% of them problematic. It's a miracle.
> Igal (User:IKhitron)
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Potential 1 day hackathon in London on February 3

2018-01-17 Thread John Lubbock
Here is the event page for the talk in the evening following the hackathon:
https://www.eventbrite.co.uk/e/the-future-of-wikipedia-tickets-42271530285


John Lubbock

Communications Coordinator

Wikimedia UK

+44 (0) 203 372 0767



Wikimedia UK is a Company Limited by Guarantee registered in England and
Wales, Registered No. 6741827. Registered Charity No.1144513. Office 1,
Ground Floor, Europoint, 5 - 11 Lavington Street, London SE1 0NZ.

Wikimedia UK is the UK chapter of a global Wikimedia movement. The
Wikimedia projects are run by the Wikimedia Foundation (who operate
Wikipedia, amongst other projects). *Wikimedia UK is an independent
non-profit charity with no legal control over Wikipedia nor responsibility
for its contents.*

On 17 January 2018 at 10:57, John Lubbock <john.lubb...@wikimedia.org.uk>
wrote:

> Hello developers,
>
> I would like to gauge the community's interest in taking part in a 1 day
> hackathon in London on February 3. It's a bit short notice, but we have
> been offered 1 day during a more general hackathon on the future of Wikis.
> It's going under the general title of Darvoz (Portuguese for 'to give
> voice') and Katherine Maher will be in London during that evening and will
> be giving a talk at the same venue from 7-9.
>
> So I would like to see how many developers would be interested to come
> down and join in a hackathon and attend the talk in the evening. At the
> moment, it's undecided what the focus of the hackathon would be, but that's
> why I would like to see if any community members could come and take part,
> and potentially help me to organise the day. Please let me know if you are
> interested and if you could come to London (if you might need somewhere to
> stay the night, or if you might need travel expenses to get there) for that
> day.
>
> Here's the darvoz.org site.
>
> John Lubbock
>
> Communications Coordinator
>
> Wikimedia UK
>
> +44 (0) 203 372 0767 <+44%2020%203372%200767>
>
>
>
> Wikimedia UK is a Company Limited by Guarantee registered in England and
> Wales, Registered No. 6741827. Registered Charity No.1144513. Office 1,
> Ground Floor, Europoint, 5 - 11 Lavington Street, London SE1 0NZ.
>
> Wikimedia UK is the UK chapter of a global Wikimedia movement. The
> Wikimedia projects are run by the Wikimedia Foundation (who operate
> Wikipedia, amongst other projects). *Wikimedia UK is an independent
> non-profit charity with no legal control over Wikipedia nor responsibility
> for its contents.*
>
>
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Potential 1 day hackathon in London on February 3

2018-01-17 Thread John Lubbock
Hello developers,

I would like to gauge the community's interest in taking part in a 1 day
hackathon in London on February 3. It's a bit short notice, but we have
been offered 1 day during a more general hackathon on the future of Wikis.
It's going under the general title of Darvoz (Portuguese for 'to give
voice') and Katherine Maher will be in London during that evening and will
be giving a talk at the same venue from 7-9.

So I would like to see how many developers would be interested to come down
and join in a hackathon and attend the talk in the evening. At the moment,
it's undecided what the focus of the hackathon would be, but that's why I
would like to see if any community members could come and take part, and
potentially help me to organise the day. Please let me know if you are
interested and if you could come to London (if you might need somewhere to
stay the night, or if you might need travel expenses to get there) for that
day.

Here's the darvoz.org site.

John Lubbock

Communications Coordinator

Wikimedia UK

+44 (0) 203 372 0767



Wikimedia UK is a Company Limited by Guarantee registered in England and
Wales, Registered No. 6741827. Registered Charity No.1144513. Office 1,
Ground Floor, Europoint, 5 - 11 Lavington Street, London SE1 0NZ.

Wikimedia UK is the UK chapter of a global Wikimedia movement. The
Wikimedia projects are run by the Wikimedia Foundation (who operate
Wikipedia, amongst other projects). *Wikimedia UK is an independent
non-profit charity with no legal control over Wikipedia nor responsibility
for its contents.*

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Can we drop revision hashes (rev_sha1)?

2017-12-06 Thread John Erling Blad
What is the current state, will some kind of digest be retained?

On Thu, Sep 21, 2017 at 9:56 PM, Gergo Tisza  wrote:

> On Thu, Sep 21, 2017 at 6:10 AM, Daniel Kinzler <
> daniel.kinz...@wikimedia.de
> > wrote:
>
> > Yes, we could put it into a separate table. But that table would be
> > exactly as
> > tall as the content table, and would be keyed to it. I see no advantage.
>
>
> The advantage is that MediaWiki almost would never need to use the hash
> table. It would need to add the hash for a new revision there, but table
> size is not much of an issue on INSERT; other than that, only slow
> operations like export and API requests which explicitly ask for the hash
> would need to join on that table.
> Or this primarily a disk space concern?
>
> > Also, since content is supposed to be deduplicated (so two revisions with
> > > the exact same content will have the same content_address), cannot that
> > > replace content_sha1 for revert detection purposes?
> >
> > Only if we could detect and track "manual" reverts. And the only reliable
> > way to
> > do this right now is by looking at the sha1.
>
>
> The content table points to a blob store which is content-addressible and
> has its own deduplication mechanism, right? So you just send it the content
> to store, and get an address back, and in the case of a manual revert, that
> address will be one that has already been used in other content rows. Or do
> you need to detect the revert before saving it?
>
> SHA1 is not that slow.
> >
>
> For the API/Special:Export definitely not. Maybe for generating the
> official dump files it might be significant? A single sha1 operation on a
> modern CPU should not take more than a microsecond: there are a few hundred
> operations in a decently implemented sha1 and processors are in the GHz
> range. PHP benchmarks [1] also give similar values. With the 64-byte block
> size, that's something like 5 hours/TB - not sure how that compares to the
> dump process itself (also it's probably running on lots of cores in
> parallel).
>
>
> [1] http://www.spudsdesign.com/benchmark/index.php?t=hash1
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Thoughts for handy features

2017-11-16 Thread John Elliot V
Hey there.

I love MediaWiki and have been using it everywhere for years. Recently
I've been doing some rather major documentation and I realised there
were three features which would be really handy for me (if these or
similar already exist I would love to know!):

1. For links to articles in sections on the same page it would be really
handy if we had syntax like: [[#unity|]] which would auto-complete to
[[#unity|unity]] for you.

2. For duplicated content it would be handy if you could define a bunch
of "variables" down the bottom of a page and then reference them from
elsewhere. I am aware of templates, but those are overkill and difficult
to maintain per my use case (my use case is documenting the "purpose" of
a computer, I duplicate this in various places, but don't want to
maintain templates for that).

3. It would be cool if for any given wiki page an "estimated reading
time" could be provided. Along with maybe a word count, character count,
etc.
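
For the third idea, a rough sketch of how such an estimate could already be
computed client-side; the 200 words-per-minute figure and the use of the
TextExtracts API are assumptions, and very long pages may come back truncated:

    import requests

    API = "https://www.mediawiki.org/w/api.php"     # any wiki with the TextExtracts extension

    def reading_time(title: str, wpm: int = 200):
        params = {
            "action": "query",
            "prop": "extracts",
            "explaintext": 1,
            "titles": title,
            "format": "json",
            "formatversion": 2,
        }
        page = requests.get(API, params=params).json()["query"]["pages"][0]
        words = len(page.get("extract", "").split())
        return words, round(words / wpm, 1)     # (word count, estimated minutes)

    print(reading_time("MediaWiki"))
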

Since I'm here, quick thanks to the MediaWiki community for creating
such wonderful wiki software!

Regards,
John Elliot V

-- 
E: j...@jj5.net
P: +61 4 3505 7839
W: https://www.jj5.net/
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Engineering] Tech talk: Selenium tests in Node.js

2017-10-30 Thread John
Željko, I'm tempted to throw a {{trout}} at you. Please never repeat what
you did in this post and talk via emoji. We do not write in hieroglyphics.

On Mon, Oct 30, 2017 at 8:33 PM, Rachel Farrand 
wrote:

> A reminder with streaming details for Tuesday's tech talk:
>
> *Tech Talk**:* Selenium tests in Node.js
> *Presenter:* Zeljko Filipin
> *Date:* October 31, 2017
> *Time: *16:00 UTC
>  Tech+Talk+%3A+Selenium+tests+in+Node.js=20171031T16=1440=1>
>
> *Length:* 30 minutes
> Link to live YouTube stream 
> *IRC channel for questions/discussion:* #wikimedia-office
>
> *Summary: *
>
> Selenium tests in Node.js. We will write a new simple test for a MediaWiki
> extension. An example: https://www.mediawiki.org/
> wiki/Selenium/Node.js/Write
>
> *Feel free to forward this email to any other relevant wikimedia lists.*
>
> On Thu, Oct 26, 2017 at 10:59 AM, Željko Filipin 
> wrote:
>
> > # Who ‍
> >
> > Željko Filipin, Engineer (Contractor) from Release Engineering team.
> > That's me! 
> >
> > # What 
> >
> > Selenium tests in Node.js. We will write a new simple test for a
> MediaWiki
> > extension. An example: https://www.mediawiki.org/
> > wiki/Selenium/Node.js/Write
> >
> > # When ⏳
> >
> > Tuesday, October 31, 16:00 UTC
> >
> > # Where 
> >
> > The internet! The event will be streamed and recorded. Details coming
> soon.
> >
> > # Why 
> >
> > We are deprecating Ruby Selenium framework:
> https://phabricator.wikimedia.
> > org/T173488
> >
> > See you there!
> >
> > Željko Filipin
> >
> > ___
> > Engineering mailing list
> > engineer...@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/engineering
> >
> >
>
>
> --
> Rachel Farrand
> Events Program Manager
> Technical Collaboration Team
> Wikimedia Foundation
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Can we drop revision hashes (rev_sha1)?

2017-09-19 Thread John Erling Blad
There are two important use cases: one where you want to identify previous
reverts, and one where you want to identify close matches. There are other
ways to do the first besides using a digest, but the digest opens up
alternate client-side algorithms. The latter would typically be done by some
locality-sensitive hashing. In both cases you don't want to download the
content of each revision; that is exactly why you want to use some kind of
hashes. If the hashes could be requested somehow, perhaps as part of the
API, then it should be sufficient. Those hashes could be part of the XML
dump too, but if you have the XML dump and know the algorithm, then you
don't need the digest.
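
A minimal sketch of the first use case: spotting identical-content (manual)
reverts in a page history without downloading any revision text, using the
per-revision SHA-1 the API already exposes (the page title is a placeholder):

    import requests

    API = "https://en.wikipedia.org/w/api.php"

    params = {
        "action": "query",
        "prop": "revisions",
        "titles": "Example",                 # hypothetical page
        "rvprop": "ids|timestamp|sha1",
        "rvlimit": "100",
        "format": "json",
        "formatversion": "2",
    }
    revs = requests.get(API, params=params).json()["query"]["pages"][0]["revisions"]

    seen = {}
    for rev in reversed(revs):               # oldest first
        digest = rev.get("sha1")
        if digest is None:                   # suppressed/deleted revisions carry no hash
            continue
        if digest in seen:
            print(f"rev {rev['revid']} restores the content of rev {seen[digest]}")
        else:
            seen[digest] = rev["revid"]
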

There is a specific use case where someone wants to verify the content. In
those cases you don't want to identify a previous revert; you want to check
whether someone has tampered with the downloaded content. As you don't know
who might have tampered with the content, you should also question the
digest delivered by WMF, thus the digest in the database isn't good enough
as it is right now. Instead of a SHA digest, each revision should be
properly signed, but then if you can't trust WMF, can you trust their
signature? Signatures for revisions should probably be delivered by some
external entity and not WMF itself.

On Fri, Sep 15, 2017 at 11:44 PM, Daniel Kinzler <
daniel.kinz...@wikimedia.de> wrote:

> A revert restores a previous revision. It covers all slots.
>
> The fact that reverts, watching, protecting, etc still works per page,
> while you
> can have multiple kinds of different content on the page, is indeed the
> point of
> MCR.
>
> Am 15.09.2017 um 22:23 schrieb C. Scott Ananian:
> > Alternatively, perhaps "hash" could be an optional part of an MCR chunk?
> > We could keep it for the wikitext, but drop the hash for the metadata,
> and
> > drop any support for a "combined" hash over wikitext + all-other-pieces.
> >
> > ...which begs the question about how reverts work in MCR.  Is it just the
> > wikitext which is reverted, or do categories and other metadata revert as
> > well?  And perhaps we can just mark these at revert time instead of
> trying
> > to reconstruct it after the fact?
> >  --scott
> >
> > On Fri, Sep 15, 2017 at 4:13 PM, Stas Malyshev 
> > wrote:
> >
> >> Hi!
> >>
> >> On 9/15/17 1:06 PM, Andrew Otto wrote:
>  As a random idea - would it be possible to calculate the hashes
> >>> when data is transitioned from SQL to Hadoop storage?
> >>>
> >>> We take monthly snapshots of the entire history, so every month we’d
> >>> have to pull the content of every revision ever made :o
> >>
> >> Why? If you already seen that revision in previous snapshot, you'd
> >> already have its hash? Admittedly, I have no idea how the process works,
> >> so I am just talking out of general knowledge and may miss some things.
> >> Also of course you already have hashes from revs till this day and up to
> >> the day we decide to turn the hash off. Starting that day, it'd have to
> >> be generated, but I see no reason to generate one more than once?
> >> --
> >> Stas Malyshev
> >> smalys...@wikimedia.org
> >>
> >> ___
> >> Wikitech-l mailing list
> >> Wikitech-l@lists.wikimedia.org
> >> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>
> >
> >
> >
>
>
> --
> Daniel Kinzler
> Principal Platform Engineer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Log of nulledits

2017-07-25 Thread John
Bináris, the root issue is that MediaWiki doesn't see null edits as edits;
it sees them more as a purge with forcelinkupdate=True. The logs that
contain that information are not in MediaWiki, but rather the webserver
logs. Exposing those logs is a privacy issue. Second, if there are
performance issues, the Ops staff have access to those logs and will use
them to troubleshoot. No one on wiki can do anything about null bots, nor
should it be something they worry about.
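
For illustration, the purge call that a "null bot" run is being compared to
here (the wiki URL and the title are placeholders; the API purge module wants
a POST request):

    import requests

    API = "https://xx.wikipedia.org/w/api.php"

    resp = requests.post(API, data={
        "action": "purge",
        "titles": "Some template-using page",    # the module accepts multiple titles joined with "|"
        "forcelinkupdate": 1,
        "format": "json",
    })
    print(resp.json())    # reports which pages were purged; nothing is recorded in any on-wiki log
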

On Tue, Jul 25, 2017 at 8:58 AM, Bináris  wrote:

> 2017-07-25 14:35 GMT+02:00 Dan Garry :
>
> > By definition, a null edit does not perform any change at all, and is
> > therefore not recorded publicly since there's technically nothing to
> > record. I suspect the only way you could find this kind of information is
> > in the server logs, and access to those is very tightly restricted for
> > privacy reasons.
> >
>
> I understand, but I think this *therefore* would be worth discussing.
> Null edits are not subject to privacy protection; they are now in protected
> logs only accidentally or for historical reasons. If there is an action
> noticed by the server (definitely there is, because it has an effect on the
> page, that's why people often do it), it may be logged in the way real
> edits are.
>
> This would also be useful for researchers. One may be interested in the
> pattern of null edits, the quantity of them (e.g. is it useful to null edit
> 20,000 pages because of the change of a template, or is it time to find
> some better way of making the changes visible?).
> If there is no reason to exclude these from logs (I don't see any), we
> should make them visible. Perhaps not by default, but with a switch.
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Get Wikipedia Page Titles using API looks Endless

2017-05-07 Thread John
Those are official; I ran the report from Tool Labs, which is Wikimedia's
developer platform and includes a copy of en.Wikipedia's database (with
sensitive fields removed). Without looking at your code and doing some
testing, which unfortunately I don't have the time for, I cannot help
debugging why your code isn't working. Those two files were created by
running "sql enwiki_p "select page_title from page where
page_is_redirect = 0 and page_namespace = 0;" > ns_0.txt" and then compressing
the resulting text file via 7zip. For the category namespace I just changed
page_namespace = 0 to page_namespace = 14.
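
For reference, a minimal sketch of the continuation loop that list=allpages
expects, matching the namespace and redirect filters described above; even
with this, the dumps remain the saner route for a full enwiki title list:

    import requests

    API = "https://en.wikipedia.org/w/api.php"

    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": 0,
        "apfilterredir": "nonredirects",
        "aplimit": "max",
        "format": "json",
    }
    titles = []
    while True:
        data = requests.get(API, params=params).json()
        titles.extend(p["title"] for p in data["query"]["allpages"])
        if "continue" not in data:
            break
        params.update(data["continue"])      # carries apcontinue (and continue) for the next request

    print(len(titles))
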

On Sun, May 7, 2017 at 3:41 AM, Abdulfattah Safa <fattah.s...@gmail.com>
wrote:

> hello John,
> Thanks for your effort. Actually I need official dumps as I need to use
> them in my thesis.
> Could you please point me how did you get these ones?
> Also, any idea why the API doesn't work properly for en Wikipedia? I use
> the same code for other language and it worked.
>
> Thanks,
> Abed,
>
> On Sun, May 7, 2017 at 1:45 AM John <phoenixoverr...@gmail.com> wrote:
>
> > Here you go
> > ns_0.7z <http://tools.wmflabs.org/betacommand-dev/reports/ns_0.7z>
> > ns_14.7z <http://tools.wmflabs.org/betacommand-dev/reports/ns_14.7z>
> >
> > On Sat, May 6, 2017 at 5:27 PM, John <phoenixoverr...@gmail.com> wrote:
> >
> > > Give me a few minutes I can get you a database dump of what you need.
> > >
> > > On Sat, May 6, 2017 at 5:25 PM, Abdulfattah Safa <
> fattah.s...@gmail.com>
> > > wrote:
> > >
> > >> 1. I'm using max as a limit parameter
> > >> 2. I'm not sure if the dumps have the data I need. I need to get the
> > >> titles
> > >> for all Articles (name space = 0), with no redirects and also need the
> > >> titles of all Categories (namespace = 14) without redirects
> > >>
> > >> On Sat, May 6, 2017 at 11:39 PM Eran Rosenthal <eranro...@gmail.com>
> > >> wrote:
> > >>
> > >> > 1. You can use limit parameter to get more titles in each request
> > >> > 2. For getting many entries it is recommended to extract from dumps
> or
> > >> from
> > >> > database using quarry
> > >> >
> > >> > On May 6, 2017 22:36, "Abdulfattah Safa" <fattah.s...@gmail.com>
> > wrote:
> > >> >
> > >> > > for the & in $Continue=-||, it's a typo. It doesn't exist in the
> > code.
> > >> > >
> > >> > > On Sat, May 6, 2017 at 10:12 PM Abdulfattah Safa <
> > >> fattah.s...@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > I'm trying to get all the page titles in Wikipedia in namespace
> > >> using
> > >> > the
> > >> > > > API as following:
> > >> > > >
> > >> > > > https://en.wikipedia.org/w/api.php?action=query&format=
> > >> > > xml&list=allpages&apnamespace=0&apfilterredir=nonredirects&
> > >> > > aplimit=max&$continue=-||$apcontinue=BASE_PAGE_TITLE
> > >> > > >
> > >> > > > I keep requesting this url and checking the response if contains
> > >> > continue
> > >> > > > tag. if yes, then I use same request but change the
> > *BASE_PAGE_TITLE
> > >> > *to
> > >> > > > the value in apcontinue attribute in the response.
> > >> > > > My application has been running for 3 days and the number of
> > >> > > > retrieved titles
> > >> > > > exceeds 30M, whereas it is about 13M in the dumps.
> > >> > > > any idea?
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > ___
> > >> > > Wikitech-l mailing list
> > >> > > Wikitech-l@lists.wikimedia.org
> > >> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > >> > ___
> > >> > Wikitech-l mailing list
> > >> > Wikitech-l@lists.wikimedia.org
> > >> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > >> ___
> > >> Wikitech-l mailing list
> > >> Wikitech-l@lists.wikimedia.org
> > >> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > >>
> > >
> > >
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Get Wikipedia Page Titles using API looks Endless

2017-05-06 Thread John
Here you go
ns_0.7z <http://tools.wmflabs.org/betacommand-dev/reports/ns_0.7z>
ns_14.7z <http://tools.wmflabs.org/betacommand-dev/reports/ns_14.7z>

On Sat, May 6, 2017 at 5:27 PM, John <phoenixoverr...@gmail.com> wrote:

> Give me a few minutes I can get you a database dump of what you need.
>
> On Sat, May 6, 2017 at 5:25 PM, Abdulfattah Safa <fattah.s...@gmail.com>
> wrote:
>
>> 1. I'm using max as a limit parameter
>> 2. I'm not sure if the dumps have the data I need. I need to get the
>> titles
>> for all Articles (name space = 0), with no redirects and also need the
>> titles of all Categories (namespace = 14) without redirects
>>
>> On Sat, May 6, 2017 at 11:39 PM Eran Rosenthal <eranro...@gmail.com>
>> wrote:
>>
>> > 1. You can use limit parameter to get more titles in each request
>> > 2. For getting many entries it is recommended to extract from dumps or
>> from
>> > database using quarry
>> >
>> > On May 6, 2017 22:36, "Abdulfattah Safa" <fattah.s...@gmail.com> wrote:
>> >
>> > > for the & in $Continue=-||, it's a typo. It doesn't exist in the code.
>> > >
>> > > On Sat, May 6, 2017 at 10:12 PM Abdulfattah Safa <
>> fattah.s...@gmail.com>
>> > > wrote:
>> > >
>> > > > I'm trying to get all the page titles in Wikipedia in namespace
>> using
>> > the
>> > > > API as following:
>> > > >
>> > > > https://en.wikipedia.org/w/api.php?action=query&format=
>> > > xml&list=allpages&apnamespace=0&apfilterredir=nonredirects&
>> > > aplimit=max&$continue=-||$apcontinue=BASE_PAGE_TITLE
>> > > >
>> > > > I keep requesting this url and checking the response if contains
>> > continue
>> > > > tag. if yes, then I use same request but change the *BASE_PAGE_TITLE
>> > *to
>> > > > the value in apcontinue attribute in the response.
>> > > > My application has been running for 3 days and the number of
>> > > > retrieved titles
>> > > > exceeds 30M, whereas it is about 13M in the dumps.
>> > > > any idea?
>> > > >
>> > > >
>> > > >
>> > > ___
>> > > Wikitech-l mailing list
>> > > Wikitech-l@lists.wikimedia.org
>> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>> > ___
>> > Wikitech-l mailing list
>> > Wikitech-l@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>> ___
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>
>
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Get Wikipedia Page Titles using API looks Endless

2017-05-06 Thread John
Give me a few minutes I can get you a database dump of what you need.

On Sat, May 6, 2017 at 5:25 PM, Abdulfattah Safa 
wrote:

> 1. I'm using max as a limit parameter
> 2. I'm not sure if the dumps have the data I need. I need to get the titles
> for all Articles (name space = 0), with no redirects and also need the
> titles of all Categories (namespace = 14) without redirects
>
> On Sat, May 6, 2017 at 11:39 PM Eran Rosenthal 
> wrote:
>
> > 1. You can use limit parameter to get more titles in each request
> > 2. For getting many entries it is recommended to extract from dumps or
> from
> > database using quarry
> >
> > On May 6, 2017 22:36, "Abdulfattah Safa"  wrote:
> >
> > > for the & in $Continue=-||, it's a typo. It doesn't exist in the code.
> > >
> > > On Sat, May 6, 2017 at 10:12 PM Abdulfattah Safa <
> fattah.s...@gmail.com>
> > > wrote:
> > >
> > > > I'm trying to get all the page titles in Wikipedia in namespace using
> > the
> > > > API as following:
> > > >
> > > > https://en.wikipedia.org/w/api.php?action=query&format=
> > > xml&list=allpages&apnamespace=0&apfilterredir=nonredirects&
> > > aplimit=max&$continue=-||$apcontinue=BASE_PAGE_TITLE
> > > >
> > > > I keep requesting this url and checking the response if contains
> > continue
> > > > tag. if yes, then I use same request but change the *BASE_PAGE_TITLE
> > *to
> > > > the value in apcontinue attribute in the response.
> > > > My application has been running for 3 days and the number of retrieved
> > > > titles exceeds 30M, whereas it is about 13M in the dumps.
> > > > any idea?
> > > >
> > > >
> > > >
> > > ___
> > > Wikitech-l mailing list
> > > Wikitech-l@lists.wikimedia.org
> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Fair use image indicated as free use?

2017-04-21 Thread John
Ideally that span should be added by {{non-free media}}, which is the meta
template for identifying non-free media.

On Fri, Apr 21, 2017 at 4:28 PM, Derk-Jan Hartman <
d.j.hartman+wmf...@gmail.com> wrote:

> It might have been on purpose that there was no licensetpl_nonfree. The
> reason being, that even though that template looks like a license it isn't
> an actual license. Instead we use fair use rationale's, and these do
> have licensetpl_nonfree. However this specific file was not using the right
> template for machine recognizable fair use rationales.
>
> Fixed in:
> https://en.wikipedia.org/w/index.php?title=File:Odin_
> lloyd.jpg&diff=776566993&oldid=674493870
>
> On Fri, Apr 21, 2017 at 8:40 PM, Brad Jorsch (Anomie) <
> bjor...@wikimedia.org
> > wrote:
>
> > On Fri, Apr 21, 2017 at 2:20 PM, Fako Berkers 
> > wrote:
> >
> > > I'm running the tool algo-news and I discovered that this image:
> > > https://en.wikipedia.org/wiki/File:Odin_lloyd.jpg
> > > Is indicated as a free image in the API:
> > > https://en.wikipedia.org/w/api.php?format=json=
> > > pageprops=query=1=39787564
> > >
> > > Should I report this as a bug?
> >
> >
> > That feature uses the presence of a particular hidden <span> in the HTML
> > of
> > the page to determine non-freeness. On that particular image, the license
> > template was missing that span,[1] and it also didn't use the standard
> > non-free use rationale template[2] which also would have added the needed
> > span.
> >
> > Then null edits to the image and the page fixed things up.
> >
> >  [1]: Fixed in
> > https://en.wikipedia.org/w/index.php?title=Template:Non-
> > free_fair_use&diff=776553380&oldid=749836414
> >  [2]: https://en.wikipedia.org/wiki/Template:Non-free_use_rationale
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Fair use image indicated as free use?

2017-04-21 Thread John
Not sure how that metadata is being populated, but on enwiki you should look
for either Template:Non-free media or Category:All non-free media.
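
A minimal sketch of the category check suggested above, assuming the standard
action API (the file name and user-agent string are placeholders):

  import requests

  API = "https://en.wikipedia.org/w/api.php"

  def is_non_free(file_title):
      """Check whether a file sits in Category:All non-free media on enwiki."""
      data = requests.get(API, params={
          "action": "query",
          "format": "json",
          "prop": "categories",
          "clcategories": "Category:All non-free media",  # only report this one
          "titles": file_title,
      }, headers={"User-Agent": "non-free-check-sketch/0.1"}).json()
      page = next(iter(data["query"]["pages"].values()))
      return "categories" in page

  print(is_non_free("File:Odin_lloyd.jpg"))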

On Fri, Apr 21, 2017 at 2:20 PM, Fako Berkers  wrote:

> Hi,
>
> I'm running the tool algo-news and I discovered that this image:
> https://en.wikipedia.org/wiki/File:Odin_lloyd.jpg
> Is indicated as a free image in the API:
> https://en.wikipedia.org/w/api.php?format=json=
> pageprops=query=1=39787564
>
> Should I report this as a bug? Or if the "page_image_free" does not
> indicate free use, how can I determine easily whether an image is indeed
> free use instead of only fair use?
>
> Thanks,
>
> Fako
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] CORS Error in VisualEditor/Parsoid/RESTBase Setup

2017-03-04 Thread John P. New
I am trying to get VisualEditor/Parsoid/RESTBase set up for a private wiki. So 
far this list has helped me past a couple of roadblocks and I am back with 
another one. 

Currently, I am running only VE and Parsoid successfully: I can create, edit 
and save pages in VE. I want to add in the RESTBase server so that I can switch 
from wikitext editing to VE and save changes made in wikitext.

I am running MediaWiki 1.28 on a shared host (sharedhost.example.com) and 
Parsoid and RESTBase on my home server (homeserver.example.com). I have 
installed an SSL certificate on homeserver.example.com and serve Parsoid & 
RESTBase through stunnel. The IP address for homeserver.example.com is resolved 
by the DNS on the shared host. Note that the domain structure in my example is 
the same as in reality: the two servers share a common base domain 
(example.com) with different sub-domains (sharedhost. and homeserver.)

Now when I try to open a page for editing in VE, I get the following error in 
the Firefox console (URLs changed to match the example situation above):
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the 
remote resource at 
https://homeserver.example.com:7232/sharedhost.example.com/v1/transform/wikitext/to/html//1184.
 (Reason: CORS header ‘Access-Control-Allow-Origin’ missing).  (unknown)
RESTBase load failed: error

At this point I get an error dialog in Firefox that says, "Error loading data 
from server: Could not connect to the server. Would you like to retry?"

I should point out that I am not the only one experiencing this situation. See 
https://www.mediawiki.org/w/index.php?title=Topic:Tm2qsg4ywsykmahr

What can we do to solve the CORS issue?

John
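
One way to narrow this down is to replay the browser's preflight by hand and
check whether the Access-Control-Allow-Origin header ever comes back; a small
diagnostic sketch using the example hostnames above (not a fix in itself):

  import requests

  # Replay the CORS preflight the browser sends before the real request.
  url = ("https://homeserver.example.com:7232/"
         "sharedhost.example.com/v1/transform/wikitext/to/html")
  resp = requests.options(url, headers={
      "Origin": "https://sharedhost.example.com",
      "Access-Control-Request-Method": "POST",
  })

  print(resp.status_code)
  # None here means the RESTBase/stunnel layer is not emitting CORS headers.
  print(resp.headers.get("Access-Control-Allow-Origin"))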

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Integrating Parsoid & RESTbase into a secure MediaWiki Install

2017-02-22 Thread John P. New
To wrap this up, I managed to get SSL to RESTbase working, but using CPanel 
AutoSSL instead of LetsEncrypt. AutoSSL and LetsEncrypt are similar services 
but I used AutoSSL because my main mediawiki install is served on CPanel.

In a nutshell, I:
1) Created a sub-domain, mw.mywikidomain.com. The associated SSL certificate was 
automatically created.
2) Pointed the DNS entry for mw.mywikidomain.com to my home server IP address.
3) Exported the certificate and key entries into an stunnel install running on 
my home server and listening on port 7232.
4) Changed the $wgVisualEditorRestbaseURL and $wgVisualEditorFullRestbaseURL to 
point to https://mw.mydomain.com:7232/mywikidomain.com/v1/page/html/ and 
https://mw.mydomain.com:7232/mywikidomain.com/, respectively.

Thanks for the help.

John
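
A quick way to confirm the stunnel endpoint now presents a certificate clients
will trust, using the hostname and port from the steps above (a diagnostic
sketch, not part of the original setup):

  import socket
  import ssl

  hostname = "mw.mywikidomain.com"  # sub-domain from step 1
  port = 7232                       # stunnel listener from step 3

  context = ssl.create_default_context()  # validates against the system store
  with socket.create_connection((hostname, port)) as sock:
      with context.wrap_socket(sock, server_hostname=hostname) as tls:
          cert = tls.getpeercert()
          print("issuer: ", cert.get("issuer"))
          print("expires:", cert.get("notAfter"))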

On February 22, 2017 10:14:54 AM John P. New wrote:
> Thanks to a couple of members of this list I was able to get Visual Editor 
> working on my WikiMedia install.
> 
> Now I would like to run the wiki under SSL. Of course, as soon as I do, my 
> browser complains of mixed content from the RESTbase server and won't load VE 
> at all.
> 
> I am running MediaWiki 1.28 on a shared host, which means no access to 
> node.js. So in order to run Parsoid and RESTbase I have installed both on my 
> home server. As such, I have no way of getting a trusted SSL certificate for 
> it; the most I could do is a self-signed certificate, which I am sure will 
> cause as many browser complaints as the current mixed-content does.
> 
> My question is, what is the likelihood of getting this configuration to work 
> under SSL?
> 
> John


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Integrating Parsoid & RESTbase into a secure MediaWiki Install

2017-02-22 Thread John P. New
I had thought of LetsEncrypt, but I was under the (mistaken) impression that 
SSL certificates are bound to the IP address as well as the hostname of the 
server. Upon further investigation, I see that SSL certificates are not IP 
address dependent.

I'll give it a try, thanks.

On February 22, 2017 04:57:30 PM Alex Monk wrote:
> You can get a trusted cert for your home server. Look into LetsEncrypt.
> 
> On 22 Feb 2017 3:15 pm, "John P. New" <wikit...@hazelden.ca> wrote:
> 
> > Thanks to a couple of members of this list I was able to get Visual Editor
> > working on my WikiMedia install.
> >
> > Now I would like to run the wiki under SSL. Of course, as soon as I do, my
> > browser complains of mixed content from the RESTbase server and won't load
> > VE at all.
> >
> > I am running MediaWiki 1.28 on a shared host, which means no access to
> > node.js. So in order to run Parsoid and RESTbase I have installed both on
> > my home server. As such, I have no way of getting a trusted SSL certificate
> > for it; the most I could do is a self-signed certificate, which I am sure
> > will cause as many browser complaints as the current mixed-content does.
> >
> > My question is, what is the likelihood of getting this configuration to
> > work under SSL?
> >
> > John


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Integrating Parsoid & RESTbase into a secure MediaWiki Install

2017-02-22 Thread John P. New
Thanks to a couple of members of this list I was able to get Visual Editor 
working on my WikiMedia install.

Now I would like to run the wiki under SSL. Of course, as soon as I do, my 
browser complains of mixed content from the RESTbase server and won't load VE 
at all.

I am running MediaWiki 1.28 on a shared host, which means no access to node.js. 
So in order to run Parsoid and RESTbase I have installed both on my home 
server. As such, I have no way of getting a trusted SSL certificate for it; the 
most I could do is a self-signed certificate, which I am sure will cause as 
many browser complaints as the current mixed-content does.

My question is, what is the likelihood of getting this configuration to work 
under SSL?

John

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] VisualEditor discards changes made in Wikitext editor

2017-02-21 Thread John P. New
Alex and James,

Thanks for the guidance. I've set up the RESTbase server and it's working well: 
wikitext editor changes are reflected in VE when switching (and vice versa).

John

On February 20, 2017 09:13:58 PM Alex Monk wrote:
> One of the things that should probably be noted is that, if I recall
> correctly, this was the feature that required the addition of the
> wgVisualEditorFullRestbaseURL configuration - previously there was just
> wgVisualEditorRestbaseURL. You'll need to set up RB and point that config
> var at it.
> 
> On 20 February 2017 at 21:02, John P. New <wikit...@hazelden.ca> wrote:
> 
> > On February 20, 2017 08:34:35 PM James Forrester wrote:
> > > On Mon, 20 Feb 2017 at 10:42 John P. New <wikit...@hazelden.ca> wrote:
> > >
> > > > I've set up MW 1.28 with VE/parsoid and everything is working well
> > > >
> > > > However, when I edit in (any) wikitext editor and try to switch to VE,
> > I
> > > > am presented with a dialog with only 2 choices: "Cancel" or "Discard my
> > > > changes and switch". Diving into the code I found I could change
> > > > ve.init.MWVESwitchConfirmDialog.js (line 58) from
> > > >   modes: [ 'restbase' ]
> > > > to
> > > >   modes: [ 'restbase', 'simple' ]
> > > >
> > > > that adds "Switch" to the dialog box, but when this option is chosen,
> > any
> > > > changes made in the wikitext editor are lost.
> > > >
> > > > Is this a mis-configuration on my part or is switching from edited
> > > > wikitext to VE not supported? Do I need a RESTbase server to implement
> > this
> > > > functionality?
> > > >
> > >
> > > RESTbase provides the switching-with-changes ability, yes. There's a
> > reason
> > > VE doesn't offer switching-with-changes without it. :-) In general,
> > > fiddling with the code inside an extension is always going to break.
> > >
> > > J.
> > >
> > Thanks for the clarification.
> >
> > Should this be noted in the setup instructions for Visual Editor? It would
> > have saved me a lot of head-scratching and investigation time.
> >
> > And yes, I realize messing with the code will break things, but it was my
> > way of finding out that I might need a RESTbase server and to point to
> > where my problem was. :-)
> >
> > John
> >
> >
> >
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] VisualEditor discards changes made in Wikitext editor

2017-02-20 Thread John P. New
On February 20, 2017 08:34:35 PM James Forrester wrote:
> On Mon, 20 Feb 2017 at 10:42 John P. New <wikit...@hazelden.ca> wrote:
> 
> > I've set up MW 1.28 with VE/parsoid and everything is working well
> >
> > However, when I edit in (any) wikitext editor and try to switch to VE, I
> > am presented with a dialog with only 2 choices: "Cancel" or "Discard my
> > changes and switch". Diving into the code I found I could change
> > ve.init.MWVESwitchConfirmDialog.js (line 58) from
> >   modes: [ 'restbase' ]
> > to
> >   modes: [ 'restbase', 'simple' ]
> >
> > that adds "Switch" to the dialog box, but when this option is chosen, any
> > changes made in the wikitext editor are lost.
> >
> > Is this a mis-configuration on my part or is switching from edited
> > wikitext to VE not supported? Do I need a RESTbase server to implement this
> > functionality?
> >
> 
> RESTbase provides the switching-with-changes ability, yes. There's a reason
> VE doesn't offer switching-with-changes without it. :-) In general,
> fiddling with the code inside an extension is always going to break.
> 
> J.
> 
Thanks for the clarification.

Should this be noted in the setup instructions for Visual Editor? It would have 
saved me a lot of head-scratching and investigation time.

And yes, I realize messing with the code will break things, but it was my way 
of finding out that I might need a RESTbase server and to point to where my 
problem was. :-)

John



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] VisualEditor discards changes made in Wikitext editor

2017-02-20 Thread John P. New
I've set up MW 1.28 with VE/parsoid and everything is working well

However, when I edit in (any) wikitext editor and try to switch to VE, I am 
presented with a dialog with only 2 choices: "Cancel" or "Discard my changes 
and switch". Diving into the code I found I could change 
ve.init.MWVESwitchConfirmDialog.js (line 58) from
  modes: [ 'restbase' ]
to
  modes: [ 'restbase', 'simple' ]

that adds "Switch" to the dialog box, but when this option is chosen, any 
changes made in the wikitext editor are lost.

Is this a mis-configuration on my part or is switching from edited wikitext to 
VE not supported? Do I need a RESTbase server to implement this functionality?


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Creating a broken link

2017-01-25 Thread John
Does the page exist already?
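
For reference, when the target page does not exist MediaWiki renders the link
with class="new" and points it at the editor via action=edit&redlink=1; the
red colour itself comes from the skin's styling of that class. A sketch of
that markup built by hand (the script path is a placeholder):

  from urllib.parse import quote

  def red_link(title, script_path="/w/index.php"):
      """Build the kind of anchor MediaWiki emits for a nonexistent page."""
      href = (script_path + "?title=" + quote(title.replace(" ", "_"))
              + "&action=edit&redlink=1")
      return ('<a href="' + href + '" class="new" title="' + title
              + ' (page does not exist)">' + title + '</a>')

  print(red_link("Some missing page"))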

On Wed, Jan 25, 2017 at 8:23 AM Victor Porton  wrote:

> How to create a broken ("edit", "red") link to a page?
>
> That is, I want to generate HTML that displays a link which, when clicked,
> leads to the editor for the page. The link should be red.
>
> What is the right way to do this?
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Discussion Platform

2016-12-10 Thread John
Depends on how you define easy.
http://bots.wmflabs.org/~wm-bot/logs/%23mediawiki/ is a recording of
everything in #mediawiki by date, oldest at the top, newest at the bottom.
I would consider that fairly easy.

On Sat, Dec 10, 2016 at 5:39 PM, Cyken Zeraux <cykenzer...@gmail.com> wrote:

> Public logging isn't very accessible. When you join in a chat after being
> disconnected (like IRC does all the time), you'd like to look through the
> previous discussions easily.
>
> On Sat, Dec 10, 2016 at 4:30 PM, John <phoenixoverr...@gmail.com> wrote:
>
> > An issue was raised about only seeing IRC messages while logged in,
> however
> > the WMF does publicly log several of their channels, so that is a moot
> > point
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Discussion Platform

2016-12-10 Thread John
An issue was raised about only seeing IRC messages while logged in; however,
the WMF does publicly log several of its channels, so that is a moot point.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Xmldatadumps-l] Wikipedia page IDs

2016-12-03 Thread John
It looks like the page was deleted and restored, thus giving it a new page ID.
Originally, when pages were deleted the page_id was not kept, which caused
a new page_id to be issued when the page was restored. This phenomenon has
since been fixed, and should no longer happen.
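
A quick way to see the effect John describes is to ask the API which page ID
is currently attached to a title; a minimal sketch, assuming the standard
action API (the user-agent string is a placeholder):

  import requests

  API = "https://en.wikipedia.org/w/api.php"

  # Look up the page ID currently attached to a title.
  data = requests.get(API, params={
      "action": "query",
      "format": "json",
      "prop": "info",
      "titles": "South Africa",
  }, headers={"User-Agent": "pageid-lookup-sketch/0.1"}).json()

  for page in data["query"]["pages"].values():
      print(page["title"], page["pageid"])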

On Sat, Dec 3, 2016 at 8:47 AM, Renato Stoffalette Joao  wrote:

> Hi all.
>
> Firstly, apologies for eventual duplicates or posting the question in the
> wrong mailing  list.
>
> Secondly, could anybody kindly explain to me if some Wikipedia pages
> changed their IDs from the past ? Or if so point to me where this might be
> documented ?
> I have Wikipedia pages-articles  XML dumps from the years 2006 and 2008
> and  when I was parsing those dumps I ran across some situations
> such as the following one. In the dumps from 2006 and 2008 I found that
> the South Africa page has the ID 68854, while in the most current Wikipedia
> pages-articles XML dump (i.e. 2016) the same article has the ID  17416221.
> I am trying to match some Wiki pages by IDs across time, but the example
> above is not helping.
>
> Much appreciated in advance for any help.
>
> --
> Renato Stoffalette Joao
> - PhD Student -
> L3S Research Center / Leibniz Uni.
> 15th Floor, Room:1519
> Appelstraße 9a
> 30167 Hannover, Germany
> +49.511.762-17759
>
>
> ___
> Xmldatadumps-l mailing list
> xmldatadump...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Google Code-in 2016 just started and we need your help!

2016-12-03 Thread John Mark Vandenberg
What happens if a mentor doesn't respond within 36 hrs?
The weekend is going to be the most problematic stage, especially this
first weekend.

Does this "36 hrs" also apply to reviewing in Gerrit and comments in
Phabricator?

I am seeing students not submit their GCI task for approval once they
have uploaded their patch into Gerrit.  This is actually "good" in one
sense, as they haven't finished the task until it is merged, but if we
are not tracking Gerrit review times of GCI patches, it means the
mentors are not obligated to do code review within 36 hrs and the
participants are surprised at how long the code review phase is
taking. (48 hrs in one case).

Is there any warning when the 36 hour limit is approaching?

Are Wikimedia org admins watching this limit somehow?
Is there some process in place?
e.g. 24 hr "this is getting worrying" status, where we find another
mentor / code reviewer?


On Tue, Nov 29, 2016 at 12:00 AM, Andre Klapper <aklap...@wikimedia.org> wrote:
> Google Code-in 2016 just started:
> https://www.mediawiki.org/wiki/Google_Code-in_2016
>
> In the next seven weeks, many young people are going to make their
> first contributions to Wikimedia. Expect many questions on IRC and
> mailing lists by onboarding newcomers who have never used IRC or lists
> before. Your help and patience is welcome to provide a helping hand!
>
> Thanks to all mentors who have already registered & provided tasks!
>
> You have not become a mentor yet? Please do consider it.
> It is fun and we do need more tasks! :)
>
> * Think of easy tasks in your area that you could mentor.
>   Areas are: Code, docs/training, outreach/research, quality
>   assurance, and user interface. "Easy" means 2-3h to complete for
>   you, or less technical ~30min "beginner tasks" for onboarding).
> * OR: provide an easy 'clonable' task (a task that is generic and
>   could be repeated many times by different students).
> * Note that you commit to answer to students' questions and to
>   evaluate their work within 36 hours (but the better your task
>   description the less questions. No worries, we're here to help!)
>
> For the full info, please check out
> https://www.mediawiki.org/wiki/Google_Code-in_2016/Mentors
> and ask if something is unclear!
>
> Thank you again for giving young contributors the opportunity to learn
> about and work on all aspects of Free & Open Source Software projects!
>
> Cheers,
> andre
> --
> Andre Klapper | Wikimedia Bugwrangler
> http://blogs.gnome.org/aklapper/
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l



-- 
John Vandenberg

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Update on WMF account compromises

2016-11-21 Thread John Mark Vandenberg
On Mon, Nov 21, 2016 at 10:42 PM, Bartosz Dziewoński
<matma@gmail.com> wrote:
> Just for the record, I'm not having such problems, so it might be in some
> way specific to you. I've heard someone else recently complaining about
> getting logged in often, I don't think this is related to 2FA.
>
> If you need to disable it, you can do it yourself (visit Preferences, click
> "Disable two-factor authentication" and follow the steps).

I switch devices regularly, and switch browsers also.
Desktop session continues without a hitch, mostly.
Mobile devices are always being logged out.

-- 
John Vandenberg

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Update on WMF account compromises

2016-11-21 Thread John Mark Vandenberg
Ya, this is why I haven't done it.

Also, I should be able to set it up such that TFA is not necessary
until my account attempts to do an admin action.

On Mon, Nov 21, 2016 at 4:37 PM, Florence Devouard <fdevou...@gmail.com> wrote:
> Hello
>
> I had the super bad idea of implementing the two-factor authentication and
> now I need help :)
>
> The system is not "recording" me as registered. Which means that I am
> disconnected every once in a while. Roughly every 15 minutes... and every
> time I change project (from Wikipedia to Commons etc.)
>
> Which means that every 15 minutes, I need to relogin... retype login and
> password... grab my phone... wake it up... launch the app... get the
> number... enter it... validate... OK, good to go for 15 minutes...
>
> So... how do I fix that ?
>
> Thanks
>
> Florence
>
>
> Le 16/11/2016 à 10:57, Tim Starling a écrit :
>>
>> Since Friday, we've had a slow but steady stream of admin account
>> compromises on WMF projects. The hacker group OurMine has taken credit
>> for these compromises.
>>
>> We're fairly sure now that their mode of operation involves searching
>> for target admins in previous user/password dumps published by other
>> hackers, such as the 2013 Adobe hack. They're not doing an online
>> brute force attack against WMF. For each target, they try one or two
>> passwords, and if those don't work, they go on to the next target.
>> Their success rate is maybe 10%.
>>
>> When they compromise an account, they usually do a main page
>> defacement or similar, get blocked, and then move on to the next target.
>>
>> Today, they compromised the account of a www.mediawiki.org admin, did
>> a main page defacement there, and then (presumably) used the same
>> password to log in to Gerrit. They took a screenshot, sent it to us,
>> but took no other action.
>>
>> So, I don't think they are truly malicious -- I think they are doing
>> it for fun, fame, perhaps also for their stated goal of bringing
>> attention to poor password security.
>>
>> Indications are that they are familiarising themselves with MediaWiki
>> and with our community. They probably plan on continuing to do this
>> for some time.
>>
>> We're doing what we can to slow them down, but admins and other users
>> with privileged access also need to take some responsibility for the
>> security of their accounts. Specifically:
>>
>> * If you're an admin, please enable two-factor authentication.
>> <https://meta.wikimedia.org/wiki/H:2FA>
>> * Please change your password, if you haven't already changed it in
>> the last week. Use a new password that is not used on any other site.
>> * Please do not share passwords across different WMF services, for
>> example, between the wikis and Gerrit.
>>
>> (Cross-posted to wikitech-l and wikimedia-l, please copy/link
>> elsewhere as appropriate.)
>>
>> -- Tim Starling
>>
>>
>> ___
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>
>
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l



-- 
John Vandenberg

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Engaging student devs with another outreach event like GSoC, GCI

2016-11-07 Thread John Mark Vandenberg
This is probably a good thread to introduce a program we have been
running in Indonesia called Besut Kode, funded by Ford Foundation.

We noticed there were not many GCI/GSOC participants from Indonesia,
and also not many Wikimedia devs from Indonesia, and are trying to fix
that.

We are using a competitive training program format, with eliminations
and prizes, to ensure that we spend more time mentoring the
participants with the most potential to succeed in the OSS world.

The project has two halves, the first targeting high school (called
SMA) students preparing them for GCI, and then the second targeting
University students preparing them for GSOC.  Between them we have had
almost 1000 registrations.

The participant list for both is at:
https://github.com/BesutKode/BesutKode.github.io

The high school program is entirely in private repositories, allowing
the students to make all kinds of mistakes, and learn from them, with
the goal being to submit a real, significant patch to an OSS project.
Six have finished the program, with merged contributions to real OSS
projects, and a few more are likely to finish it before GCI starts,
but have struggled to fit it in with their other activities.

The program for university students is more public.  The English
version of the program is at

http://wikimedia-id.github.io/besutkode/university-modules-en.html

One of the features is that we eliminate participants if they are not
active on GitHub every three days, requiring that they complete a
small patch to a pre-selected set of repositories that have increasing
difficulty and slowly moving them more towards GSOC relevant
repositories.

http://wikimedia-id.github.io/besutkode/university-activity-repositories-en.html

You can see their ongoing activity at http://tinyurl.com/bku-other-repos .

In addition, they have to work on some quite difficult tasks, which
they can work on together but must have distinct solutions.

The first of these large tasks is public at
https://github.com/BesutKode/uni-task-1

So far, nine participants have completed that task and are now working
on the second task.

https://github.com/orgs/BesutKode/teams/peserta-universitas-task-2

All of the program materials will be public and CC-BY after the
competition is over, as is required by all Ford grants.

-- 
John Vandenberg

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

  1   2   3   4   5   6   >