[Wikitech-l] Re: ILocalizedException refactor

2021-09-11 Thread Gergő Tisza
Are the larger plans around refactoring the message system written down somewhere?

On Thu, Sep 9, 2021 at 11:51 AM Thomas Chin  wrote:

> Hello everyone,
> We are planning to rework ILocalizedException, which will break anything
> implementing the interface. Ticket: T287405
> TL;DR: The change will deprecate ILocalizedException::getMessageObject()
> and replace it with ::getMessageValue()
> Of course, we will take care of updating affected Wikimedia deployed
> extensions.
> Please refer to the ticket and comment if you have any questions or
> concerns.
> Best,
> --
> *Thomas Chin * (he/him)
> Software Engineer - Platform Engineering
> Wikimedia Foundation 
> ___
> Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
> To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org

[Wikitech-l] Code style poll: return typehint spacing

2021-06-20 Thread Gergő Tisza
There's an ongoing discussion about how PHP return typehints should be
formatted in MediaWiki code:

function foo(): int {


function foo() : int {

If you are interested, please vote here:

(The permission system for votes is somewhat inflexible. If you want to
vote but can't, please leave a comment.)

Re: [Wikitech-l] Allow HTML email

2020-09-22 Thread Gergő Tisza
Yes please. A mere fifty years after the invention of hyperlinks, it would
be great to adopt them here.

Re: [Wikitech-l] Ethical question regarding some code

2020-08-09 Thread Gergő Tisza
FWIW, the movement strategy included a recommendation* for having a
technology ethics review process [1]; maybe this is a good opportunity to
experiment with creating a precursory, unofficial version of that - make a
wiki page for the sock puppet detection tool, and a proposal process for
such pages, and consider where we could source expert advice from.

* More precisely, it was a draft recommendation. The final recommendations
were significantly less fine-grained.

Re: [Wikitech-l] Ethical question regarding some code

2020-08-09 Thread Gergő Tisza
On Fri, Aug 7, 2020 at 6:39 PM Ryan Kaldari  wrote:

> Whatever danger is embodied in Amir's code, it's only a matter of time
> before this danger is ubiquitous. And for the worst-case
> scenario—governments using the technology to hunt down dissidents—I imagine
> this is already happening. So while I agree there is a moral consideration
> to releasing this software, I think the moral implications aren't actually
> that huge. Eventually, we will just have to accept that creating separate
> accounts is not an effective way to protect your identity.

Deanonymizing wiki accounts is one way of misusing the tool, and one which
would indeed happen anyway. Another scenario is an attacker examining the
tool with the intent of misleading it (such as using an adversarial network
to construct edits which the tool would consistently misidentify as
belonging to a certain user, which could be used to cast suspicion on a
legitimate user). That scenario specifically depends on the model being
publicly available.

Re: [Wikitech-l] Ethical question regarding some code

2020-08-09 Thread Gergő Tisza
On Sat, Aug 8, 2020 at 7:43 PM Amir Sarabadani  wrote:

> * By closed source, I don't mean it will be only accessible to me, It's
> already accessible by another CU and one WMF staff, and I would gladly
> share the code with anyone who has signed NDA and they are of course more
> than welcome to change it. Github has a really low limit for people who can
> access a private repo but I would be fine with any means to fix this.

Closed source is commonly understood to mean the code is not under an
OSI-approved open-source license (such code is banned from Toolforge).
Contrary to common misconceptions, many OSI-approved open-source licenses
(such as the GPL) allow keeping the code private, as long as the software
itself is also kept private. IMO it would be less confusing to use the
"public"/"private" terminology here - yes the code should be open-sourced,
but that's mostly orthogonal to the concerns discussed here.

> * It has been pointed out by people in the checkuser mailing list that
> there's no point in logging accessing this tool, since the code is
> accessible to CUs (if they want to), so they can download and run it on
> their computer without logging anyway.

There's a significant difference between your actions not being logged vs.
your actions being logged unless you actively circumvent the logging (in
ways which would probably seem malicious). Clear red lines work well in a
community project even when there's nothing physically stopping people from
stepping over them.

> * There is a huge difference between CU and this AI tool in matters of
> privacy. While both are privacy sensitive but CU reveals much more, as a
> CU, I know where lots of people are living or studying because they showed
> up in my CUs (...) but this tool only reveals a connection between
> accounts if
> one of them is linked to a public identity and the other is not which I
> wholeheartedly agree is not great but it's not on the same level as seeing
> people's IPs.

On the other hand, IP checks are very unreliable. A hypothetical tool that
is reliable would be a bigger privacy concern, since it would be used more
often and more successfully to extract private details.
(On the other other hand, as a Wikipedia editor I have a reasonable
expectation of privacy of the site not telling its administrators where I
live. Do I have a reasonable expectation of privacy for not telling them
what my alt accounts are? Arguably not.)

Also, how much help would such a tool be in off-wiki stylometry? If it can
be used (on its own or with additional tooling) to connect wiki accounts to
other online accounts, that would subjectively seem to me to have a
significantly larger privacy impact than IP addresses.

Re: [Wikitech-l] Ethical question regarding some code

2020-08-06 Thread Gergő Tisza
Technically, you can make the tool open-source and keep the source code
secret. That solves the maintenance problem (others who get access can
legally modify). Of course, you'd have to trust everyone with access to the
files to not publish them which they would be technically entitled to
(unless there is some NDA-like mechanism).

Transparency and auditability wouldn't be fulfilled just by making the code
public, anyway; they need to be solved by tool design (keeping logs,
providing feedback options for the users, trying to expose the components
of the decision as much as possible).

I'd agree with Bawolff though that there is probably no point in going to
great lengths to keep details secret as creating a similar tool is probably
not that hard. You can build some assumptions into the tool which are
nontrivial to fulfill outside Toolforge (e.g. use the replicas instead of
dumps) to make running it require an effort, at least.

[Wikitech-l] Feedback requested: dropping PHP 7.2 support for MediaWiki 1.35

2020-07-28 Thread Gergő Tisza
Hi all,

https://phabricator.wikimedia.org/T257879 recommends dropping support for
PHP 7.2 in the upcoming MediaWiki 1.35 release. (It would still be
supported in master as it will probably take months for Wikimedia
production to switch.) Tl;dr: 1.35 is an LTS release which we'll support
for 3 years, and supporting an old PHP version in an LTS release tends to
be inconvenient in a number of ways. More details in the task.

Your feedback in the task would be appreciated, especially if you would be
affected by the change in a positive or negative way.

Re: [Wikitech-l] Reason for actionthrottledtext API blocking?

2020-01-13 Thread Gergő Tisza
On Mon, Jan 13, 2020 at 6:57 AM Baskauf, Steven James <
steve.bask...@vanderbilt.edu> wrote:

> The other thing that is different about what I'm doing and what is being
> done by the other user who is not encountering this problem is that I'm
> authenticating directly by establishing a session when the script starts
> (lines 347-349).

As Brad said, you should use OAuth:
It won't help with the throttling, but it's simpler and more secure (and if
you do use PAWS, it should work without any setup).

> Eventually I will probably apply for a bot flag, but I doubt that this bot
> will ever be autonomous, so is that really necessary?

While the ultimate authority on this is always the community of the given
wiki (via its bureaucrats, or a bot approval committee in some cases), IMO
semi-autonomous tools are the ones where a human reviews every edit, which
does not seem to be the case here.
In any case, it is necessary if you want to make several dozen edits a
minute and are not in any other group which has the noratelimit right.

> Would it matter if I used my own account instead of a separate bot account?

Depends on the exact limit you've hit, but probably not unless you are an
administrator on that wiki.

Re: [Wikitech-l] Documentation/Examples on enabling localization in Gadgets

2019-11-27 Thread Gergő Tisza
On Tue, Nov 26, 2019 at 2:39 PM Egbe Eugene  wrote:

> Is there any documentation or Gadget I can have a quick look at to be able
> to learn how to enable translation in gadgets?

The common but not too great approach is to have separate per-language JS
or JSON pages which put the messages into some object, create and
maintain the pages manually, and load the right one via AJAX (and
reimplement what's needed from mw.messages functionality - typically that's
just parameter substitution).
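In miniature, that per-language-object approach plus $1/$2 parameter substitution looks something like the following (plain JavaScript; the message key and strings are invented for illustration, and a real gadget would load only the needed language page via AJAX rather than inlining all of them):

```javascript
// Per-language message objects, normally kept on separate wiki pages
// (e.g. a .../en.js and .../de.js subpage) and loaded on demand.
const messages = {
  en: { 'greet-user': 'Hello, $1! You have $2 new messages.' },
  de: { 'greet-user': 'Hallo, $1! Du hast $2 neue Nachrichten.' }
};

// Minimal stand-in for mw.message(...).text(): look up the key in the
// user's language, fall back to English, then substitute $1, $2, ...
function msg(lang, key, ...params) {
  const text = (messages[lang] && messages[lang][key]) || messages.en[key];
  return text.replace(/\$(\d+)/g, (match, n) => {
    const value = params[Number(n) - 1];
    return value !== undefined ? String(value) : match;
  });
}

console.log(msg('de', 'greet-user', 'Alice', 3));
// Hallo, Alice! Du hast 3 neue Nachrichten.
```

Note this handles only plain parameter substitution, not plural/gender magic words, which is one reason the approach is "common but not too great".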

The richer gadget definition syntax enabled by the Gadget definition:
namespace [1] is a more performant and more translator-friendly but
heavier to set up solution: you can define the used messages there,
translate them via translatewiki and access them via mw.messages (without
any manual hacking of that object needed). Not sure how the setup would
look on the translatewiki side though; presumably there would have to be
some faux extension where the messages are defined.

IMO a nice lightweight setup would be via JSON message collections; that's
blocked on T156210 [2] and (to a lesser extent) T198758 [3] currently.

[2] https://phabricator.wikimedia.org/T156210 - Support translation of JSON
blobs in Translate
[3] https://phabricator.wikimedia.org/T198758 - Load .json configuration
files via ResourceLoaderWikiModule

Re: [Wikitech-l] Help on creating a new extension

2019-11-24 Thread Gergő Tisza
Hi Sohom!

On Fri, Nov 22, 2019 at 10:25 AM Sohom Datta  wrote:

> I'd like some guidance on how to implement the unfinished parts of the
> extension especially the integration with FlaggedRevs

I presume that means displaying stableness settings in a similar way to
page protection? Scribunto_LuaFlaggedRevsLibrary::getStabilitySettings does
just that (to expose it to Lua code), so you can imitate what it does.

> and the addition of a log entry to a popup (details are there in the
> readme.md of the repo) .

There's LogEventList::showLogExtract amongst other things. Of course you'd
have to think about caching.

> Also any feedback on my code would also be very good.

Not very likely to happen in email. Probably the best way is if you make a
pull request or Gerrit changeset containing all your code and ask for
review on that.

> Also, it would be great if somebody could guide me on how to make this a
> official extension someday.

See https://www.mediawiki.org/wiki/Gerrit/New_repositories on how to request a
repository / create an extension page. There isn't anything particularly
official about that, anyone can do it for any extension. See
https://www.mediawiki.org/wiki/Continuous_integration/Entry_points on how
to set up CI in Gerrit (although in practice it might be easier to just
copy it from another extension).

If you want the extension to be used in Wikimedia production, see

Re: [Wikitech-l] 503 Backend fetch failed

2019-10-01 Thread Gergő Tisza
503 Backend fetch failed is a generic error message you get when the
application server breaks so badly that it fails to even return an error
message (e.g. PHP fatal errors can do this), so Varnish has to make up its
own error message.

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-19 Thread Gergő Tisza
On Mon, Mar 18, 2019 at 3:01 PM Derk-Jan Hartman <
d.j.hartman+wmf...@gmail.com> wrote:

> Last year has seen a lot of focus on Technical Debt. WMF also has a core
> platform team now, which finally allows a more sustainable chipping away at
> some of the technical debt.

Yeah. Having tech debt is never great but what gets people concerned is
when it just grows and grows, and management dismisses concerns because it
is always more important to have the next feature out quickly. We used to
have a bit of that problem, but IMO there have been lots of positive
changes in the last two years or so, and there is now a credible
organization-wide effort to get debt under control (mainly looking at
the Platform Evolution program here). Having the core platform team also
helped a lot, and in my impression some other teams that had in the past
focused on fast feature iteration have also been given more space to do
things right.

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-17 Thread Gergő Tisza
On Sat, Mar 16, 2019 at 5:37 PM Thomas Eugene Bishop <
thomasbis...@wenlin.com> wrote:

> A bug fix was provided years ago but never accepted or rejected. It’s the
> first and last MediaWiki bug ever assigned to me. I’ve just unassigned
> myself.
> https://phabricator.wikimedia.org/T149639
> In cases like this, remarks like “Because you did not fix these bugs” and
> “... anyone is free to pick it up and work on it ... No further response
> needed” miss the point. When a bug fix is provided, but nobody with
> authority to accept or reject it ever does so, that’s a failure on the part
> of those who have authority, not on the part of those who are able and
> willing to fix bugs. Sure, volunteers are “free” to waste their time!

The code review backlog is a genuine problem (I'd say it's in the top 3
problems we have, along with the lack of good documentation and of
well-structured, testable code). It's entirely unrelated to the task backlog
and the other topics in this thread, though.
There has been plenty of discussion on it and various attempts at
addressing it (you can see some in T78768 [1], or in various Wikimedia
Developer Summit sessions such as T149639 [2]).
Unfortunately without much result so far, but the problem is definitely not
a lack of awareness. (I'd argue that lack of organizational focus /
commitment *is* a problem, so making your voice heard in the various
planning processes would be helpful. wikitech-l is not a great place for
that, though.)

> You need to use and share your authority more effectively, to “be bold”
> with accepting and rejecting bug fixes. Authorize more people to accept or
> reject bug fixes. Assign each proposed bug fix to one such person, starting
> with the oldest bugs. Then hold those people accountable. You don’t lack
> volunteers, you lack volunteers with authority.

Being able to accept bug fixes effectively means being able to deploy code
to Wikimedia production, which has security and robustness implications. So
there are some limits on how widely we can distribute that authority.
That said, we are probably more conservative than we should be, and
nominating new reviewers [3] is one of the more useful things one could do.

[1] https://phabricator.wikimedia.org/T78768
[2] https://phabricator.wikimedia.org/T149639

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-14 Thread Gergő Tisza
On Wed, Mar 13, 2019 at 3:02 PM Strainu  wrote:

> The main problem I see with the community wishlist is that it's a
> process beside the normal process, not part of it. The dedicated team
> takes 10 bugs and other developers another ~10. I think we would be
> much better off if each team at the WMF would also take the top ranked
> bug on their turf and solve it and bump the priority of all the other
> bugs by one (e.g. low->medium). One bug per year per team means at
> least 18 bugs (at least if [1] is up to date) or something similar.

Community Tech is seven people and they do ten wishlist requests a year.
(Granted, they do other things too, but the wishlist is their main focus.)
So you are proposing to reallocate on average 1-2 months per year for every
team to work on wishlist wishes. That's about two million dollars of donor
money. How confident are you that the wishlist is actually a good way of
estimating the impact of tasks, outside of the narrow field where editors
have personal experience (ie. editing tools)?

> What a wonderful world that would be! Unfortunately, all too often I
> feel that objective measures (such as "+1" on bugs, duplicates etc.)
> have no effect on prioritization.

Leaving aside how objective those measures are, in my experience when the
task is related to a product owned by a team, they are aware of it and take
it into
account. (Which does not necessarily mean they agree, of course.) A lot of
production components don't really have an owner, though. (Or only do to
the extent that there is someone who can be pulled away from their normal
job if the thing catches fire.) That's just the reality of running the
website we have with the capacity we have - the alternative would be
undeploying things or not starting new projects for a long time. The
Wikimedia movement happens to be in the middle of its strategic planning
process, so if you want to argue for either of these things, this is a good
time to do it. (Not a good place, though.)

> - UploadWizard (2 with high priority, 40 with normal, a few dozens
> low, hundreds more untriaged): this is the project that got us out of
> the "overloading the lang parameter for customizing the uploader" era,
> the project that is used by millions of people every year, including
> during every photo contest

UploadWizard is not in active development currently. If you want to argue
that the Multimedia team should be reassigned to work on it (and drop the
Structured Data on Commons project), or some other rearrangement should be
made, that's a specific proposal that can be meaningfully discussed.
(Probably not here, though - that's a matter of movement strategy, not a
technical decision. So maybe better suited to wikimedia-l.)
Saying that some project should be picked up, without saying what should be
dropped to make space, is an easy thing to do. Not all that useful though.

(As an aside, I'd love to see more people self-organize to get more say in
how priorities are decided. If you look at the discussion pages on WMF
annual plans, movement strategy and so on, they do not give the impression
of a community that's seriously interested in its own future.)

Re: [Wikitech-l] Question to WMF: Backlog on bugs

2019-03-14 Thread Gergő Tisza
On Thu, Mar 14, 2019 at 4:36 AM John Erling Blad  wrote:

> Google had a problem with unfixed bugs, and they started identifying
> the involved developers each time the build was broken. That is pretty
> harsh, but what if devs somehow was named when their bugs were
> mentioned? What if there were some kind of public statistic? How would
> the devs react to being identified with a bug? Would they fix the bug,
> or just be mad about it? Devs at some of Googles teams got mad, but in
> the end the code were fixed. Take a look at "GTAC 2013 Keynote:
> Evolution from Quality Assurance to Test Engineering" [1]

Sorry to be direct but you seem to have little understanding of what you
are talking about. You are conflating different things and shifting
goalposts every time you comment on this thread. You are jumping between
various positions involving "a large bug backlog is bad", "important bugs
are not getting prioritized accordingly", "the WMF should scale its
services down so it has the capacity to respond to every request" (ie. fire
some developers, hire more community liaisons), and now you are talking
about broken builds. Every time someone challenges your claims, you just
switch to talking about another one. This is frustrating and a waste of
other people's time. Please try to pin down what you are trying to propose
before making the proposal.

For those unfamiliar with development processes, a broken build means the
application is not starting at all, which means other developers cannot
test their own work, which means the entire development process grinds to a
halt. That is a huge hit to productivity so software organizations usually
try hard to avoid it, even though it's an internal issue with no user
impact (well, other than the organization shipping fewer features / fixes in
the next release because developer time was spent less effectively).
The closest equivalent we have for that is continuous integration tests
broken by merged code (although that's less severe since it usually doesn't
stop the application from working). While I'm sure the handling of those
could be improved (incidentally, that's just happening, see
), it has nothing to do with the original topic of this thread, since it is
happening in the development cycle and not visible to users.

About backlogs in general, Chromium is probably the biggest
open-source Google repo; it currently has 940K tickets, 60K of which are
open, and another 50K have been auto-archived after a year of inactivity.
(As others have pointed out, having a huge backlog and ruthlessly closing
tasks that do not get on the roadmap are the only two realistic options,
and while the latter does have its advantages, no one here seems to be in
favor of it.) We have 220K tasks in total, 40K of which are open, so that's
in the same ballpark (not that comparing open task ratios is particularly
meaningful, but it hopefully shows that Google's handling of the
completely unrelated build-breaking issue does not magically make them
have zero bugs).

> What if we could show information from the bugs in Phabricator in a
> "tracked" template at other wiki-projects, identifying the team
> responsible and perhaps even the dev assigned to the bug?

Users who are interested in that information would be spared the enormous
effort of pressing a button on the mouse, I guess?

> We say we don't want voting over bugs, but by saying that we refuse
> getting stats over how many users a specific bug hits, and because of
> that we don't get sufficient information (metrics) to make decisions
> about specific bugs. Some bugs (or missing features) although changes
> how users are doing specific things, how do we handle that?

How many people vote on a bug and how many people are hit by a bug are two
entirely different things, and most of the time it's hard to measure the
latter. Voting will be dominated by power users who are more engaged with
the development process, users who understand English, users who come from
a larger wiki community and can canvass better, etc. (And Phabricator
doesn't support voting anyway.)

> What if users could give a "this hits me too" from a "tracked"
> template. That would give a very simple metric on how important it is
> to fix a problem. To make this visible to the wiki-communities the
> special page could be sorted on this metric. Of course the devs would
> have completely different priorities, but this page would list the
> wiki-communities priorities.

Having a page which lists the priorities of wiki communities (more
realistically, one specific wiki community) would be very useful, IMO. As
others have pointed out, the reason no such list exists is that people are
spending their time complaining here, instead of building lists on their
wiki. (At which point they would quickly find out that actually getting a
consensus on priorities is a lot harder than complaining about 

Re: [Wikitech-l] 2019-01-09 Scrum of Scrums meeting notes

2019-01-11 Thread Gergő Tisza
On Fri, Jan 11, 2019 at 11:29 AM Pine W  wrote:

> [...] Are there any upcoming plans for systematic
> study or development of WMF-to-public communications processes from
> Audiences and Technology?

Not quite the same thing, but touches on communication:

There are a couple of existing ways to learn about new developments, though.
There is Tech News (focused on immediate user-facing changes), quarterly
department checkins (focused on the big picture and progress of annual plan
goals and other large projects), most teams have a monthly or sometimes
weekly newsletter and/or on-wiki updates page, there are some regular
showcases (research and more recently language), and people write blog
posts about larger or more interesting developments on Phabricator and the
WMF blog. (The discoverability of all of these things could certainly be
improved.) And if you are sufficiently interested in a specific team, team
Phabricator boards are public.

On Fri, Jan 11, 2019 at 11:29 AM Pine W  wrote:

> On Fri, Jan 11, 2019, 9:31 AM Dan Garry (Deskana) 
> wrote:
> > On Wed, 9 Jan 2019 at 20:25, Pine W  wrote:
> >
> > > I would like to request that every Audiences and Technology team submit
> > > highlights of recent and upcoming activities for inclusion in every set
> > of
> > > SoS notes, even if no one personally attends the SoS meeting from a
> > > particular team, so that readers of these notes can keep better track
> of
> > > what is happening in the Audiences and Technology departments and so
> that
> > > readers can make adjustments to our own plans as needed.
> >
> >
> > Scrum of scrums meetings are intended to be a venue for development teams
> > to surface upcoming blockers and dependencies on other teams, so that
> teams
> > can better work together and not block each other. Scrum of scrums
> meetings
> > are not intended to be a forum for general announcements about activities
> > by end-users. These are very different use cases with different target
> > audiences.
> >
> > I understand your concerns about visibility of the actions inside the
> > Wikimedia Foundation. It's certainly difficult to see things from the
> > outside. That said, taking a meeting with a well-defined purpose and
> > objective, and expanding that objective to add an additional, quite
> > different use case, is not good practice; doing so may cause people to
> > disengage or lose focus, thereby meaning the original objective of the
> > meeting is no longer met.
> >
> > Some reading you might find useful:
> >
> >- https://www.agilealliance.org/glossary/scrum-of-scrums/
> >- https://www.scruminc.com/scrum-of-scrums/
> >
> > Dan
> Hi Dan,
> That is helpful. Perhaps the info I am seeking would be better communicated
> in a different way. I am reluctant to request a new communications process
> that would require nontrivial effort to start and to maintain if I am the
> only one who is interested. Are there any upcoming plans for systematic
> study or development of WMF-to-public communications processes from
> Audiences and Technology? If so, perhaps I could have a conversation with
> whomever will work on that communications effort.
> Pine
> ( https://meta.wikimedia.org/wiki/User:Pine )
> >

Re: [Wikitech-l] Please comment on the draft consultation for splitting the admin role

2018-06-12 Thread Gergő Tisza
On Tue, Jun 12, 2018 at 8:56 AM Federico Leva (Nemo)  wrote:

> Personally I'd like us to explore agnostic and non-invasive solutions.

Mandatory code review (especially with a required waiting time) and
mandatory reauthentication are far more invasive than removing JS editing
permissions from administrators who don't want them.
That's not to say they shouldn't be done (again, most of the things we
could do are complementary, and pretty much anything we could do we should
do, given the crazy levels of risk involved), but they require more nuance
and experimentation.

The subdivision of permissions across more user groups relies on a
> number of assumptions which may not hold. For instance, on thousands of
> MediaWiki wikis there's only one sysop anyway.

I presume you are talking about non-Wikimedia wikis here, as Wikimedia has
fewer than a thousand wikis (and about half of them seem to do basically
zero Javascript editing and so wouldn't be inconvenienced by not having any
permissions to do so and having to call in crosswiki helpers, like they do
for vandalism). For small MediaWiki installations this change offers little
extra defense but they are not particularly interesting attack targets in
the first place. For large non-Wikimedia wikis the change will be helpful
the same way it is for Wikimedia.

> Something I would like is the ability to "have" a permission, but only
> "activate" it for limited periods of time when needed. The activation
> could require some extra steps (e.g. inserting the password again). It
> could be logged to Special:Log, which people can then monitor as they
> wish. A separate countermeasure (other than deflag) could inhibit it.

I agree reauthenticating before using more powerful or dangerous
permissions is something to look into, and there is ongoing work on that
front. But again the UX implications are nontrivial (what happens if the
timer runs out while you are editing the page?) and again there is no
reason not to do both.

Re: [Wikitech-l] Please comment on the draft consultation for splitting the admin role

2018-06-12 Thread Gergő Tisza
On Tue, Jun 12, 2018 at 3:26 AM Nathan  wrote:

> Is the risk of an attacker taking over an account with CSS/JS edit
> permissions any more or less because that person knows how to use CSS/JS?

I tried to address this in the FAQ:
> * The number of accounts which can be used to compromise the site will be
drastically reduced. Less accounts that can serve as attack vectors means a
smaller chance chance of an account being vulnerable when the password
database of some third-party website gets compromised. A smaller number of
accounts is also easier to monitor for suspicious logins.
> * Beyond the mere numbers of accounts, it will remove the most vulnerable
accounts as attack vectors. Users who can write CSS/JS code probably have
better IT skills in general, and thus better password and system security
practices.

> Can we make the
> edit right temporary, so someone can request it through a normal simple
> process, execute their edits, and then relinquish it? It can be a right
> that admins could grant to each other, as long as they can't gift it to
> themselves.

We can (with some work), and we should. The various ways to make deploying
malicious JavaScript harder are complementary, and we should do them all.
Separating permissions just happens to be the easiest one.

I feel most people don't appreciate how *extremely* scary the current
situation is. The public backlash around the Seigenthaler affair was
sparked by Wikipedia carelessly causing harm to a single individual. It
would be child's play compared to what would happen if a few ten thousand
people had their bank accounts cleaned, or a few dozen opposition members
arrested by the secret police, or something like that, because Wikipedians
decided security improvements were not worth the effort of moving users
from one group to another.

[Wikitech-l] Please comment on the draft consultation for splitting the admin role

2018-06-11 Thread Gergő Tisza
Hi all,

per the discussion on Phabricator, I'd like to split the administrator
("sysop") user group into two parts: one which can edit sitewide CSS/JS,
and one which cannot. You can find the details and rationale in the
task:

To inform the editor communities, and to make sure we can accommodate their
needs, I plan to run a community consultation; I'll probably kick it off on
Friday and have it run for two weeks. You can find the draft here:

I would appreciate it if folks who are knowledgeable about the use of CSS/JS
editing and user rights management in various parts of the community could
look at it and add their concerns or suggestions to the talk page (or
Phabricator if that's more appropriate). Suggestions for a better group
name are especially welcome.

(As I wrote in the FAQ on the consultation page, I think making sure
that MediaWiki is secure and at the same time empowers its users falls
under the authority of the developer community, and so the normal code
review process is appropriate for this change. Thus the consultation is not
intended to be an RfC or other discussion/veto type process. If you
disagree about the change in general, please discuss that on Phabricator,
or the linked Gerrit patches.)


Re: [Wikitech-l] MediaWiki 1.28 is now LTS?

2017-07-14 Thread Gergő Tisza
On Fri, Jul 14, 2017 at 3:17 PM, Robert Vogel  wrote:

> according to [1], MediaWiki 1.28.2 was an LTS release. On [2] I cannot
> find a hint about this. Is that maybe a mistake?
> [1] https://www.mediawiki.org/w/index.php?title=Download=2478392
> [2] https://www.mediawiki.org/wiki/Version_lifecycle#Release_policy

Yes, probably a copy-paste error, fixed.

[Wikitech-l] Discussion on adding a CODE_OF_CONDUCT file to all Wikimedia repos

2017-06-09 Thread Gergő Tisza
Hi all,

the Wikimedia technical community has recently adopted a Code of Conduct.
You have probably heard more about it than you wanted to, but if you have
missed it somehow, you can read the related blog post [1].

We started adding a CODE_OF_CONDUCT file with a link to all repos (this is
a new convention for declaring what a project's code of conduct is,
promoted by GitHub), which resulted in a debate about whether that is the
right thing to do. If you are interested, please join the discussion on the
Phabricator task [2].

[1] https://blog.wikimedia.org/2017/06/08/wikimedia-code-of-conduct/
[2] https://phabricator.wikimedia.org/T165540

Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread Gergő Tisza
On Tue, May 2, 2017 at 7:10 PM, Mark Clements (HappyDog) <
gm...@kennel17.co.uk> wrote:

> I seem to recall that a long, long time ago MediaWiki was using UTF-8
> internally but storing the data in 'latin1' fields in MySQL.

Indeed. See $wgLegacyEncoding (and T128149).

> I notice that there is now the option to use either 'utf8' or 'binary'
> columns (via the $wgDBmysql5 setting), and the default appears to be
> 'binary'.[1]

> I've come across an old project which followed MediaWiki's lead (literally
> - it cites MediaWiki as the reason) and stores its UTF-8 data in latin1
> tables.  I need to upgrade it to a more modern data infrastructure, but I'm
> hesitant to simply switch to 'utf8' without understanding the reasons for
> this initial implementation decision.

MySQL's utf8 uses at most three bytes per character (ie. BMP only), so it's
not a good idea to use it. utf8mb4 should work in theory. I think the only
reason we don't use it is inertia (compatibility problems with old MySQL
versions; lack of testing with MediaWiki; difficulty of migrating huge
Wikimedia datasets).
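
The BMP limitation above is easy to demonstrate outside of MySQL entirely:
UTF-8 encodes BMP characters in at most three bytes, while supplementary-plane
characters (emoji, some CJK) need four, which only utf8mb4 can store. A small
sketch in plain Python, just illustrating the encoding lengths (not MySQL
itself):

```python
# UTF-8 byte lengths: MySQL's legacy "utf8" charset only accepts
# sequences of up to 3 bytes, i.e. the Basic Multilingual Plane.
samples = {
    "A": "U+0041, ASCII",            # 1 byte
    "é": "U+00E9, Latin-1 range",    # 2 bytes
    "€": "U+20AC, BMP",              # 3 bytes
    "😀": "U+1F600, supplementary",  # 4 bytes -- needs utf8mb4
}
for ch, desc in samples.items():
    n = len(ch.encode("utf-8"))
    verdict = "fits in MySQL utf8" if n <= 3 else "requires utf8mb4"
    print(f"{desc}: {n} byte(s), {verdict}")
```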

Re: [Wikitech-l] Sitelink removal in Wikidata

2017-04-26 Thread Gergő Tisza
On Wed, Apr 26, 2017 at 3:27 PM, Amir Ladsgroup  wrote:

> The tool excludes edits that are marked as patrolled but you need to
> rollback (instead of undo/restore).

Most flagrev wikis do not use patrolling.

[Wikitech-l] [MediaWiki-announce] OAuth security update

2016-11-02 Thread Gergő Tisza
Hi all,

a minor security bug [1] has been fixed in the OAuth extension:
* a connected application could use the /identify endpoint to learn the
username of a user even if the application has been disabled.
* a connected application could use the /identify endpoint to learn the
username of a user even if the user was locked or blocked from login (this
could be problematic when OAuth is used for authentication, such as with
the OAuthAuthentication [2] extension).
The fix has been backported to all supported versions (those for MediaWiki
1.23, 1.26 and 1.27).


[1] https://phabricator.wikimedia.org/T148600
[2] https://www.mediawiki.org/wiki/Extension:OAuthAuthentication

Re: [Wikitech-l] Tools for dealing with citations of withdrawn academic journal articles

2015-09-17 Thread Gergő Tisza
On Tue, Aug 18, 2015 at 2:42 AM, Pine W  wrote:

> Is there any easy way to find all citations of specified academic
> articles on Wikipedias in all languages, and the text that is supported by
> those references, so that the citations of questionable articles can be
> removed and the article texts can be quickly reviewed for possible changes
> or removal?

Not right now, but Aaron and James are working on it!

Re: [Wikitech-l] Yandex?

2015-07-02 Thread Gergő Tisza
On Thu, Jul 2, 2015 at 12:55 PM, Legoktm legoktm.wikipe...@gmail.com

 I am also interested in the answer to Nemo's question about whether this
 is the first piece of proprietary software ever entering use in the
 Wikimedia projects land?

Translatewiki has been using Yandex for a while (and used Google while it
was available, and a third one I can't remember right now).
Not sure if it counts as a Wikimedia project or not.

We also use MaxMind for geolocation, which is, I believe, free software
using a proprietary database.

Re: [Wikitech-l] Using , as decimal separator in #expr

2015-06-14 Thread Gergő Tisza
On Sun, Jun 14, 2015 at 3:58 PM, Jeroen De Dauw jeroended...@gmail.com

 I'm using the #expr parser function provided by the Parser Functions
 extension. I'd like it to use "," as the decimal separator, but it currently
 uses ".". Is there a nicer way to change this than installing String
 Functions and doing a replace?

 Are you maybe looking for {{formatnum|R}}?
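
For illustration, here is a sketch of how that could look in wikitext (a
made-up example; the actual rendering depends on the wiki's content
language):

```wikitext
{{#expr: 10 / 4 }}               <!-- #expr always outputs ".", e.g. 2.5 -->
{{formatnum:{{#expr: 10 / 4 }}}} <!-- localized: "2,5" on a wiki whose
                                      content language uses "," -->
{{formatnum:2,5|R}}              <!-- the R flag reverses localization,
                                      turning "2,5" back into "2.5" -->
```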

Re: [Wikitech-l] selenium browser testing proposal and prototype

2012-04-16 Thread Gergő Tisza
Chris McMahon cmcmahon at wikimedia.org writes:

 As QA Lead for WMF, one of the things I want to do is to create an
 institutional suite of automated cross-browser regression tests using
 Selenium.  I have two goals for this suite:  first, to make it attractive
 and convenient for the greater software testing community to both use the
 suite locally and also to contribute to it.  Second, to have the suite be a
 useful regression test tool within WMF, tied to Beta Labs and controlled by
 For various reasons, I think the best language for this project is Ruby.  I
 realize that is a controversial choice, and I would like to explain my
 reasoning.  First let me address what I think will be the most serious
 ** It's not PHP.
 As of today, PHP has no complete or authoritative implementation of
 selenium-webdriver/Selenium 2.0.  That situation is unlikely to change any
 time soon.  This leaves a choice of Ruby or Python.  For various reasons I
 think Ruby is the superior choice.

Not sure what counts as authoritative, but there are a number of fairly usable 
PHP implementations, such as php-webdriver [1] from Facebook or phpunit-selenium 
[2] from the PHPUnit framework, both of which are incomplete but very easy to 
extend (and in practice, you don't use most Selenium commands anyway). Using one 
of them is more troublesome than choosing a language in which there is a 
reference Selenium implementation, but on the other hand, you don't need to 
introduce another language, you can write the tests in a language all MediaWiki 
developers are comfortable with, and you leave open the option of reusing 
MediaWiki components in tests to handle setup/teardown of fixtures in a clean 
way.
Also, my (admittedly very superficial) experience with BDD is that 
Cucumber/Gherkin is much better for acceptance testing than RSpec (which is 
suited for unit testing). Gherkin tests are clean, human-readable descriptions 
which are easier to read than program code, and can be easily understood by non-
developers (end users, QA people, managers) even if they have no idea what a 
programming language is. On the other hand Gherkin is not a real programming 
language, so you lose some flexibility (such as the ability to use page 
objects), but IMO it is well worth it. And while RSpec relies on Ruby's elegant 
but obscure poetry mode, and thus cannot be easily copied in other languages, 
Gherkin has a simple custom syntax which is trivial to implement in any 
language; specifically, there is a good PHP implementation called Behat [3] 
which has its own Selenium implementation (Mink [4]) but also can be used with 
any other Selenium library. 
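
To make the readability argument concrete, this is roughly what a Gherkin
scenario looks like (a made-up example, not taken from any actual MediaWiki
test suite):

```gherkin
Feature: Page editing
  In order to share knowledge
  As an anonymous user
  I want to edit a wiki page

  Scenario: Anonymous user edits the sandbox
    Given I am on the "Sandbox" page
    When I click the "Edit" link
    And I type "Hello, world" into the edit box
    And I click the "Save page" button
    Then the page should contain "Hello, world"
```

Each step maps to a step definition written in the host language (Ruby for
Cucumber, PHP for Behat), so the feature file itself stays free of code.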

Mink has the additional advantage that it abstracts away the Selenium interface, 
so that Selenium can be replaced with some other browser simulator without 
changing the tests; while that makes doing Selenium-specific things more 
complicated, it can yield huge speedups for tests which don't require 
Javascript, where Selenium can be replaced with some simple browser emulator. 
(Yes, Selenium2 has its own browser emulator, but it is still a fair bit slower 
than something like Goutte [5].)

So maybe a PHP - Mink (or other Selenium library) - Behat stack instead of a 
Ruby - Watir - RSpec stack would be worth considering.

[1] https://github.com/facebook/php-webdriver
[2] https://github.com/sebastianbergmann/phpunit-selenium
[3] http://behat.org/
[4] http://mink.behat.org/
[5] https://github.com/fabpot/Goutte
