[Wikitech-l] Re: MediaWiki 1.42-alpha will be branched as a beta on 9 April 2024

2024-04-08 Thread Arlo Breault


> On Apr 8, 2024, at 10:42 PM, James Forrester  wrote:
> 
> This is now done. This is the first step in the release process for
> MediaWiki 1.42.0, which should be out in May 2024, approximately six
> months after MediaWiki 1.42.0.

six months after MediaWiki 1.41.0
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/

[Wikitech-l] Parser media HTML changes coming to group2 wikis

2023-03-09 Thread Arlo Breault
Hello all,

We have [written previously][0] about upcoming changes to how media is 
structured in the HTML output of MediaWiki's wikitext parser.  You can read 
more about it in [the FAQ][1].

Over the past many months, we have been [gradually rolling out these 
changes][2], first to test wikis, then to a number of early adopter wikis that 
opted-in to help us identify and resolve issues (thank you for that), and 
finally to all group1 wikis.

In the coming weeks, we plan on enabling the changes on group2 wikis.  This 
will be included in Tech News.

We have identified a number of gadgets, user scripts, and 
MediaWiki:Common.(css|js) pages that required forward-compatible changes, and 
have worked with interface administrators to get those [changes applied 
beforehand][3][4][5].
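
If you maintain a script that has to keep working while the rollout is in 
progress, feature detection is one option.  Here is a rough sketch (the 
selectors are illustrative; [the FAQ][1] describes the authoritative markup):

```
// Illustrative only: handle both media structures during the rollout.
// The new output wraps media in <figure typeof="mw:File/...">; the
// legacy output used <div class="thumb"> wrappers.
function findThumbs() {
    var figures = document.querySelectorAll( 'figure[typeof^="mw:File"]' );
    return figures.length ? figures : document.querySelectorAll( 'div.thumb' );
}
```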

However, should you notice any visual differences in how media is displayed or 
scripts that are no longer working, please file tasks in Phabricator with the 
[Parsoid-Read-Views (Phase 0 - Parsoid-Media-Structure)][6] project tag.

Thanks,
The Content Transform Team


[0]: 
https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/thread/L2UQJRHTFK5YG3IOZEC7JSLH2ZQNZRVU/
[1]: 
https://www.mediawiki.org/wiki/Parsoid/Parser_Unification/Media_structure/FAQ
[2]: https://phabricator.wikimedia.org/T314318
[3]: https://phabricator.wikimedia.org/T271114
[4]: https://phabricator.wikimedia.org/T297447
[5]: https://guc.toolforge.org/?by=date=ABreault+%28WMF%29
[6]: https://phabricator.wikimedia.org/project/view/5533/


[Wikitech-l] Upcoming parser HTML changes for media: Test your scripts, gadgets, bots, extensions

2021-07-13 Thread Arlo Breault
Hello all,

We'd like to inform you of a change coming in how media is structured in the 
parser's HTML output.  It has been [in the works for quite some time][1].  The 
new structure was prototyped in Parsoid's output since its inception and 
outlined in [its specification][2].

The proposed change has gone through the [RFC process][3] and an implementation 
to output this new structure in MediaWiki's core parser was [recently 
merged][4], gated behind a flag.  So far, it has been enabled on testwiki and 
testwiki2.

There are [a number of known issues][5] but we don't expect to see many 
rendering differences since we've done some [extensive visual diff testing][6]. 
 Templates won't be impacted; the old CSS styles will remain, for now.

However, we do expect work to be needed for code interacting with the 
page, be it user scripts, gadgets, extensions, bots, or other things.
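
To give a rough sense of the work involved, here is a sketch of the kind of 
selector update a user script might need.  The selectors are illustrative 
examples, not an exhaustive mapping; [the specification][2] is authoritative.

```
// Legacy parser output wrapped thumbnails roughly like:
//   <div class="thumb"><div class="thumbinner">...<div class="thumbcaption">
var oldCaptions = document.querySelectorAll( 'div.thumb .thumbcaption' );

// The new structure uses HTML5 figure elements, roughly:
//   <figure typeof="mw:File/Thumb"><a><img/></a><figcaption>...</figcaption></figure>
var newCaptions = document.querySelectorAll( 'figure[typeof^="mw:File"] figcaption' );
```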

If you'd like to help us out and get ahead of the changes before they have the 
potential to interfere with your workflow, please visit these wikis and test 
them out.  You can file tasks in Phabricator with the Parsoid-Media-Structure 
project tag.

Thanks,
The Parsing Team


[1]: https://www.mediawiki.org/wiki/Parsing/Media_structure
[2]: https://www.mediawiki.org/wiki/Specs/HTML/2.2.0#Media
[3]: https://phabricator.wikimedia.org/T118517
[4]: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/507512
[5]: https://phabricator.wikimedia.org/project/board/5428/
[6]: https://phabricator.wikimedia.org/T266149


Re: [Wikitech-l] leading space and tag

2019-07-22 Thread Arlo Breault


> On Jul 22, 2019, at 5:11 AM, Sergey F  wrote:
> 
> test2
>  test3
> 
> 
> The result of conversion is:
> 
> test2
> test3
> 

Yes, this looks like a bug.

See https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/524811

Thanks



Re: [Wikitech-l] coc ban

2018-08-08 Thread Arlo Breault


> On Aug 8, 2018, at 1:43 PM, Saint Johann  wrote:
> 
> The code of conduct is important to enforce, but, in my opinion, there 
> should be a difference in how it’s enforced. For volunteers who help the 
> movement, there should be no unacceptable language, as that is a way (and a 
> purpose of something like a code of conduct) to make MediaWiki development 
> spaces more welcoming to future volunteers.

Is it not possible that one volunteer's language discourages
other volunteers from participating, regardless of who it's
directed at?



Re: [Wikitech-l] coc ban

2018-08-08 Thread Arlo Breault


> On Aug 8, 2018, at 9:42 AM, Saint Johann  wrote:
> 
> especially when said to Wikimedia employees as opposed to volunteers.)

Can you elaborate on that?



Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-13 Thread Arlo Breault

> On Jul 13, 2017, at 10:35 AM, Subramanya Sastry  wrote:
> 
> (2) There is a Parsoid bug in detection of self-closing tags where presence 
> of a "/>" in an HTML attribute triggers a false positive. This has been 
> reported previously ... so I suppose it is not as uncommon as I thought. 
> We'll take a look at that.

No, Parsoid is doing that by design to match the PHP parser.

See T97157 and https://phabricator.wikimedia.org/T170582#3435855



Re: [Wikitech-l] Tidy will be replaced by RemexHTML on Wikimedia wikis latest by June 2018

2017-07-11 Thread Arlo Breault

> On Jul 11, 2017, at 6:13 AM, Nicolas Vervelle  wrote:
> 
>   - Where is it possible to change the description displayed in each page
>   dedicated to a category?

https://phabricator.wikimedia.org/source/mediawiki-extensions-Linter/browse/master/i18n/fr.json;6fc72c808136676da0302d98601bd4662a6b8022$37


> For example, the page for self-closed-tags [2] is
>   very short. It would be nice to be able to add a description of what the
>   error is, what problems it can cause and what are the solutions to fix it
>   (or to be able to link to a page explaining all that).

In the top right corner, there's a link to "Aide" (Help)

https://www.mediawiki.org/wiki/Help:Extension:Linter/self-closed-tag



Re: [Wikitech-l] 2017-02-15 Scrum of Scrums meeting notes

2017-02-15 Thread Arlo Breault

> On Feb 15, 2017, at 8:01 PM, Pine W  wrote:
> 
> I'm happy to see "Starting to work on audio/video support in Parsoid (VE
> will follow)". Any ETA on this? Links to Phab tickets would be appreciated
> as well.

Please see, https://phabricator.wikimedia.org/T64270#2981630



Re: [Wikitech-l] Losing the history of our projects to bitrot. Was: Acquiring list of templates including external links

2016-08-03 Thread Arlo Breault
https://www.mediawiki.org/wiki/Specs/wikitext/1.0.0



Re: [Wikitech-l] Parsoid Exception HTTP 500

2016-08-03 Thread Arlo Breault

>> Can you think of anything specific in your setup that
>> might be preventing that?
> In that case I think there could be something. I cannot start the parsoid 
> server with "service parsoid start",

What happens when you try to do that?


> so I must do it manually with nodejs and maybe that's the issue for it.

Are you passing it the config you've been working on?

`node server.js --config /path/to/localsettings.js`


>> If so, I'd suggest this change,
>> 
>> parsoidConfig.setMwApi({
>>   prefix: 'localhost',
>>   domain: 'localhost',
>>   uri: 'http://127.0.0.1/wiki/w/api.php'
>> });
>> 
>> (Note that we removed the first string argument there.)
> If I change my settings in that way i get this error:
> [fatal/request][localhost/v3/page/html/Main_Page/3] Did not find page 
> revisions for V3/page/html/Main_Page/3

Well, that's odd.  Do you have multiple MediaWiki instances
on that machine?

I'd suggest taking VE out of the equation and
just checking the Parsoid API for starters.

Your config says,

```
parsoidConfig.serverPort = 8000;
parsoidConfig.serverInterface = '0.0.0.0';
```

so, first ensure that Parsoid is running at,
http://localhost:8000/

then try visiting,
http://localhost:8000/localhost/v3/page/html/Main_Page

and confirm that it is parsing the wiki's main page
as expected.
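
If you'd rather script that check, a minimal sketch with node's built-in 
http module (using the port from your config above) would be:

```
// Fetch the parsed main page from Parsoid and report the result.
var http = require('http');
http.get('http://localhost:8000/localhost/v3/page/html/Main_Page', function (res) {
    var body = '';
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
        console.log('status:', res.statusCode, 'bytes:', body.length);
    });
}).on('error', function (err) {
    console.error('Parsoid unreachable:', err.message);
});
```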

---

Also, I'm not sure what the etiquette of this list is,
but since this is unlikely to be relevant to other
people here, maybe we should move the discussion off list.

Feel free to contact me or the Parsing team directly,
https://www.mediawiki.org/wiki/Parsoid#Contacting_us



Re: [Wikitech-l] Parsoid Exception HTTP 500

2016-07-29 Thread Arlo Breault

> On Jul 29, 2016, at 12:24 PM, Julian Loferer  wrote:
> 
> Yeah here is my localsettings.js file:
> 
> https://phabricator.wikimedia.org/P3603

Thanks!

> And I have installed it as an Ubuntu package, with apt-get install parsoid.

I'm assuming you have v0.5.1 then.  Correct me if I'm wrong.


The fact that you got this log line,

[warning/api/econnrefused][localhost/v3/page/html/Main_Page/3] Failed API 
request, 
{"error":{"code":"ECONNREFUSED","errno":"ECONNREFUSED","syscall":"connect"},"retries-remaining":0}

means that your VE is probably set up correctly.
It is at least communicating with Parsoid.

ECONNREFUSED is problematic though.
It means that Parsoid can't connect to http://192.168.0.102/wiki/w/api.php

Can you think of anything specific in your setup that
might be preventing that?

Also, please confirm that http://192.168.0.102/wiki/w/api.php,
with that entire path, is what you checked previously
for the response of the Action API.

Is Parsoid on the same machine as the process running MediaWiki?

If so, I'd suggest this change,

parsoidConfig.setMwApi({
  prefix: 'localhost',
  domain: 'localhost',
  uri: 'http://127.0.0.1/wiki/w/api.php'
});

(Note that we removed the first string argument there.)
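
One quick way to test that connectivity from the machine running Parsoid is 
a sketch like the following (the query is just a harmless siteinfo request):

```
// Confirm the action API is reachable from this host.
var http = require('http');
var uri = 'http://127.0.0.1/wiki/w/api.php?action=query&meta=siteinfo&format=json';
http.get(uri, function (res) {
    console.log('action API status:', res.statusCode);
}).on('error', function (err) {
    // ECONNREFUSED here reproduces the failure Parsoid is logging.
    console.error('cannot reach the action API:', err.code);
});
```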



Re: [Wikitech-l] Parsoid Exception HTTP 500

2016-07-29 Thread Arlo Breault

> On Jul 29, 2016, at 10:07 AM, Julian Loferer  wrote:
> 
> Yeah it looks similar. The link directs me to the right page.

Can you paste your localsettings.js file somewhere for us to
take a look?
https://phabricator.wikimedia.org/paste/edit/form/14/

Also, in case I missed it, how did you install Parsoid?



Re: [Wikitech-l] Parsoid Exception HTTP 500

2016-07-28 Thread Arlo Breault

> On Jul 28, 2016, at 9:03 AM, Julian Loferer  wrote:
> 
> I looked into my localsettings.js and didn't find a line called setInterwiki.
> I only found setMwApi. Is it the same, or should I add setInterwiki on my own?

setMwApi is the newer name for that function
and should be fine.

Can you first confirm that your MediaWiki action API
is returning the right thing?  It should look something
like https://en.wikipedia.org/w/api.php



Re: [Wikitech-l] Let's make parsoid i18n great again

2016-03-31 Thread Arlo Breault

> And there is a consensus that English is a bad choice for RTL languages, as
> it causes mixed-directional content, which should be avoided. So if we go
> with one choice, RTL languages should be an exception.

See https://gerrit.wikimedia.org/r/#/c/280792/



Re: [Wikitech-l] Issue with separate webserver port for Parsoid

2016-02-19 Thread Arlo Breault

> Parsoid and VisualEditor are working great with this setup for everything
> except images. When I first add an image (VE --> Insert --> Media) it works
> as expected. The image displays and is configurable. When I save the page
> everything functions properly. However, when I click edit again the image
> does not show. Shortly thereafter the request for the image times out and I
> get the following error in the browser console:
> 
> GET 
> http://:9000/bme/img_auth.php/thumb/6/66/BME_sign.jpg/400px-BME_sign.jpg
> net::ERR_CONNECTION_TIMED_OUT
> 
> Note that it's attempting to load the image over port 9000, not 443.
> 
> Is there a way to tell images to load over the standard entry point?

Parsoid should be producing links relative to the base element,
https://github.com/wikimedia/parsoid/blob/master/lib/wt2html/DOMPostProcessor.js#L321
which, as I understand it, VE on the page ignores.

Can you provide the wikitext that got saved and the subsequent
html it's producing?

You might want to forgo the suggestion from the troubleshooting page
and just set `parsoidConfig.strictSSL = false;`,
https://github.com/wikimedia/parsoid/blob/master/localsettings.js.example#L100-L102



Re: [Wikitech-l] global cleanup of nowiki

2015-06-30 Thread Arlo Breault
On Sunday, June 21, 2015 at 11:43 AM, Amir E. Aharoni wrote:
 Thanks Arlo. I added a few.
  
 But I'm not sure that it answers my original question: Will this be done
 every time a page happens to be edited in VE and saved, or will it be done
 globally on all pages in all wikis as some kind of a maintenance job?

Oh, sorry if I wasn’t clear.

The normalizations we’re adding will be applied to new content
added through VE. A global cleanup of all the past unnecessary
nowiki’ing is still necessary and desired.

Re: [Wikitech-l] global cleanup of nowiki

2015-06-20 Thread Arlo Breault
On Friday, June 19, 2015 at 1:38 AM, Amir E. Aharoni wrote:
 There may be more - I'm still looking for these.


If you find any, please propose them on Parsoid’s normalization talk page [0].
I’ve added the ones you’ve mentioned so far.

We’ve documented [1] what’s currently been implemented.

A few months back, Subbu solicited feedback [2] on what style norms should
be enforced. We’ve since added a `scrubWikitext` parameter to Parsoid’s API
that clients (like VE) can benefit from.
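
For clients talking to Parsoid’s HTTP API directly, enabling the
normalizations looks roughly like the sketch below. The endpoint shape and
parameter name follow the v3 HTTP API; check the API docs for your version.

```
// Rough sketch: serialize HTML to wikitext with normalizations enabled.
var http = require('http');
var payload = JSON.stringify({
    html: '<p><b>bold</b> text</p>',
    scrub_wikitext: true  // apply the normalizations while serializing
});
var req = http.request({
    host: 'localhost', port: 8000, method: 'POST',
    path: '/localhost/v3/transform/html/to/wikitext/',
    headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(payload)
    }
}, function (res) {
    var wt = '';
    res.on('data', function (c) { wt += c; });
    res.on('end', function () { console.log(wt); });
});
req.end(payload);
```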

Cleaning up our past transgressions is great. Helping to prevent their continued
existence is even better.

I was reading the discussion on gradually enabling VE for new accounts [3] and
Kww writes there,

“Further, we still have issues with stray nowiki tags being scattered across
articles. Until those are addressed, the notion that VE doesn't cause extra
work for experienced editors is simply a sign that the metrics used to analyze
effort were wrong. Jdforrester, can you explain how a study that was intended
to measure whether VE caused extra work failed to note that even with the
current limited use, it corrupts articles at this kind of volume [4]? Why
would we want to encourage such a thing?”

Makes me sad.


[0] https://www.mediawiki.org/wiki/Talk:Parsoid/Normalizations
[1] https://www.mediawiki.org/wiki/Parsoid/Normalizations
[2] https://lists.wikimedia.org/pipermail/wikitech-l/2015-April/081453.html
[3] https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28proposals%29#Gradually_enabling_VisualEditor_for_new_accounts
[4] https://en.wikipedia.org/w/index.php?title=Special:AbuseLog&offset=&limit=500&wpSearchFilter=550



Re: [Wikitech-l] Tor proxy with blinded tokens

2015-03-16 Thread Arlo Breault
I share Risker’s concerns here and limiting the anonymity
set to the intersection of Tor users and established wiki
contributors seems problematic. Also, the bootstrapping
issue needs working out, and relegating Tor users to
second-class citizens who need to edit through a proxy seems
less than ideal (though the specifics of that are unclear to me).

But, at a minimum, this seems like a useful exercise to
run if only for the experimental results and to show good faith.

I’m more than willing to help out. Please get in touch.

Arlo




On Wednesday, March 11, 2015 at 9:10 AM, Chris Steipp wrote:

> On Mar 11, 2015 2:23 AM, Gergo Tisza gti...@wikimedia.org wrote:
> > On Tue, Mar 10, 2015 at 5:40 PM, Chris Steipp cste...@wikimedia.org wrote:
> > > I'm actually envisioning that the user would edit through the third
> > > party's proxy (via OAuth, linked to the new, Special Account), so no
> > > special permissions are needed by the Special Account, and a standard
> > > block on that username can prevent them from editing. Additionally,
> > > revoking the OAuth token of the proxy itself would stop all editing by
> > > this process, so there's a quick way to pull the plug if it looks like
> > > the edits are predominantly unproductive.
> >
> > I'm probably missing the point here but how is this better than a plain
> > edit proxy, available as a Tor hidden service, which a 3rd party can set
> > up at any time without the need to coordinate with us (apart from
> > getting an OAuth key)? Since the user connects to them via Tor, they
> > would not learn any private information; they could be authorized to
> > edit via normal OAuth web flow (that is not blocked from a Tor IP); the
> > edit would seemingly come from the IP address of the proxy so it would
> > not be subject to Tor blocking.
>
> Setting up a proxy like this is definitely an option I've considered. As
> I did, I couldn't think of a good way to limit the types of accounts that
> used it, or come up with an acceptable collateral I could keep from the
> user, that would prevent enough spammers to keep it from being blocked
> while being open to people who needed it. The blinded token approach lets
> the proxy rely on a trusted assertion about the identity, by the people
> who it will impact if they get it wrong. That seemed like a good thing to
> me.
>
> However, we could substitute the entire blinding process with a public
> page that the proxy posts to that says, "this user wants to use tor to
> edit, vote yes or no and we'll allow them based on your opinion". And the
> proxy only allows tor editing by users with a passing vote.
>
> That might be more palatable for enwiki's socking policy, with the risk
> that if the user's IP has ever been revealed before (even if they went
> through the effort of getting it deleted), there is still data to link
> them to their real identity. The blinding breaks that correlation. But
> maybe a more likely first step to actually getting tor edits?

Re: [Wikitech-l] Urlencoding strip markers

2015-02-09 Thread Arlo Breault
On Tuesday, February 3, 2015 at 10:24 AM, Brion Vibber wrote:

> Special page inclusions shouldn't be able to do anything privileged;
> they're meant for public data. If that's not being enforced right now I'd
> recommend reworking or killing the special page inclusion system...

Ok, although Brion's idea preserves more of the original content, these
larger security concerns don’t look like they are going to be resolved
in short order.

I think the pragmatic thing to do is either drop the content and raise
an error, or replace the content with a warning string as Gergo suggested.

Any takers?

Re: [Wikitech-l] Urlencoding strip markers

2015-02-03 Thread Arlo Breault
On Friday, January 30, 2015 at 1:04 PM, Brion Vibber wrote:

> On Fri, Jan 30, 2015 at 12:11 PM, Jackmcbarn jackmcb...@gmail.com wrote:
> > On Fri, Jan 30, 2015 at 2:02 PM, Brion Vibber bvib...@wikimedia.org wrote:
> > > On Thu, Jan 29, 2015 at 5:38 PM, Brad Jorsch (Anomie) bjor...@wikimedia.org wrote:
> > > > On Thu, Jan 29, 2015 at 2:47 PM, Arlo Breault abrea...@wikimedia.org wrote:
> > > > > https://gerrit.wikimedia.org/r/#/c/181519/
> > > >
> > > > To clarify, the possible solutions seem to be:
> > > >
> > > > 1. Unstrip the marker and then encode the content. This is a security
> > > >    hole (T73167)
> > >
> > > I'd be inclined to unstrip the marker *and squash HTML to plaintext*,
> > > then encode the plaintext...
> >
> > I don't see how that addresses the security issue.
>
> Rollback tokens in the Special:Contributions HTML would then not be
> available in the squashed text that got encoded. Thus it could not be
> extracted and used in the timing attack.

Is this what you mean by “squash HTML to plaintext”?

urlencode( strip_tags( $parser->mStripState->unstripBoth( $s ) ) );

Is strip_tags reliable enough to not get confused and leave those
tokens lying around?

[Wikitech-l] Urlencoding strip markers

2015-01-29 Thread Arlo Breault
Currently, while {{urlencode}}ing, content in strip markers is skipped.

I believe this violates the expectation that the entire output
will be properly escaped to be placed in a sensitive context.

An example is in the infobox book caption on,
https://en.wikipedia.org/wiki/%22F%22_Is_for_Fugitive

There’s a brief discussion of the security implications of 
some proposed solutions in the review of,
https://gerrit.wikimedia.org/r/#/c/181519/

It seems best (I guess) to just drop the content (`killMarkers()`).

Any opinions or better ideas?

Thanks,
Arlo




Re: [Wikitech-l] Tor and Anonymous Users (I know, we've had this discussion a million times)

2014-10-12 Thread Arlo Breault
Thanks for initiating the conversation Derric. I've tried to put together a
proposal addressing the general problem of allowing edits from a proxy.
Feedback is appreciated.

Proposal:

* Require an account to edit via proxy.

* Allow creating accounts from proxies but globally rate limit account creations
  from all proxies (to once per five mins? or some data-driven number that makes
  sense).

* Tag any edits made through a proxy as such and put them in a queue.

* Limit the number of edits in that queue per account (to one? again, look at
  the data).

* Apply a first pass of abuse filtering on those edits before notifying a human
  of their presence to approve.

* Rate limit global proxy edits per second to something manageable (see data).

This limits the amount of backlog work a single user can create to how many
captchas they can solve / accounts they can create. But I think it's enough of a
deterrent in that 1) their edits aren't immediately visible, 2) if they're
abusive, they won't show up on the site at all, and 3) it forces premeditated
creation of accounts, which can be associated at the time of an attack and
deleted together.

Rate limiting account creation seems to open a DOS vector but combining
that with the captcha hopefully helps.
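
To make the rate-limiting piece concrete, here's a deliberately simplified
sketch. All names and numbers are illustrative; a real implementation would
need state shared across servers.

```
// Toy fixed-window rate limiter for proxy account creations.
function makeRateLimiter(maxPerWindow, windowMs) {
    var windowStart = Date.now();
    var count = 0;
    return function allow() {
        var now = Date.now();
        if (now - windowStart >= windowMs) {
            windowStart = now;  // start a new window
            count = 0;
        }
        if (count >= maxPerWindow) {
            return false;  // over the global limit; reject
        }
        count++;
        return true;
    };
}

// e.g. one account creation per five minutes across all proxies
var allowProxySignup = makeRateLimiter(1, 5 * 60 * 1000);
```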

Attribution / Licensing:

As a consequence of requiring an account to edit via proxy, we avoid the
issue of attributing edits to a shared IP.

Sybil attack:

Or, as it's called around here, sockpuppeting. CheckUser would presumably
provide less useful information but the edit history of the accounts would
still lend themselves to the same sorts of behavioural evidence gathering
that is undertaken at present.

Class system:

This makes a set of users concerned about their security and privacy trade off
some usability but that seems acceptable.

A reputation threshold for proxy users can be introduced. After a substantial
number of edits and enough time has elapsed, the above edit restrictions can be
lifted from an account. Admins would still have recourse to block/suspend the
account if it becomes abusive.

Blacklisting:

Anonymous credential systems (like Nymble) are interesting research directions
but the appropriate collateral to use is still unsolved.


Re: [Wikitech-l] Tor and Anonymous Users (I know, we've had this discussion a million times)

2014-10-12 Thread Arlo Breault
 Unless there is further discussion to be had on a new *technical* solution
 to Tor users, this is the wrong mailing list to be making these proposals.
 At the very least take it to the main wikimedia list, or on-wiki, where
 this is a lot more relevant.

Thanks Tyler. I kept the discussion going here because it sounded above
like Derric may already be in the process of doing that and I wanted to keep
a unified voice there.

Although my suggestion is similar in kind to what had already been proposed,
the main objection to it was that it would create too much work for our
already constrained resources. The addition of rate limiting is a technical
solution that may or may not be feasible.

The people on this list can best answer that.

Re: [Wikitech-l] Tor and Anonymous Users (I know, we've had this discussion a million times)

2014-10-12 Thread Arlo Breault
On Sunday, October 12, 2014 at 4:45 PM, Marc A. Pelletier wrote:
 On 10/12/2014 12:50 PM, Arlo Breault wrote:
  The people on this list can best answer that.
  
  
 What the people on this list cannot answer is /whether/ and under what
 conditions it would be desirable to allow proxy editing in the first place.
  
The “that” I was referring to was whether the rate limiting,  
as I described it above, was technically feasible.

Sorry if that wasn’t clear.

Re: [Wikitech-l] Reviewing a couple of TorBlock patches

2014-08-23 Thread Arlo Breault
I can help a bit with review. I maintain the check[0] service for the Tor
project.

[0] https://check.torproject.org/


On Wed, Jul 23, 2014 at 10:16 PM, Legoktm legoktm.wikipe...@gmail.com
wrote:

 On 7/23/14, 4:56 AM, Quim Gil wrote:
  According to our algorithm (*), TorBlock currently has the worst track
  record reviewing code contributions -- even after Tim gave a -1 to one of the
  three open patches last week (thanks!). There are two patches from Tyler
  that haven't received any feedback at all since August 2013.
 
 
 https://gerrit.wikimedia.org/r/#/q/status:open+project:mediawiki/extensions/TorBlock,n,z

 I left comments on all the patches. The -1's are mainly due to rebasing
 needed.

  Your help reviewing these patches is welcome.
 
  It is not surprising that this extension has no maintaner listed at
  https://www.mediawiki.org/wiki/Developers/Maintainers (someone suggested
  Tim in that table, he disagrees and edited accordingly).
 
  Also, maybe someone is interested in maintaining this extension? Only
  eleven patches submitted in the last 15 months.

 Probably not since I don't really know that much about tor, but I'm
 willing to review Tyler's patches if he's unable to find other people.

 -- Legoktm


Re: [Wikitech-l] Arlo Breault joins Wikimedia as Features Engineer

2014-08-23 Thread Arlo Breault
Thanks Pine. I joined up.


On Mon, Aug 18, 2014 at 12:01 PM, Pine W wiki.p...@gmail.com wrote:

 Arlo,

 That is a very professional picture.

 I see that you live in British Columbia. May I invite you to join the
 illustrious  ranks of the Cascadia Wikimedians group and sign up for our
 email list? We are hoping for formal approval of our group from Affcom
 shortly.

 Email list
 https://lists.wikimedia.org/mailman/listinfo/wikimedia-cascadia

 Meta talk
 https://meta.m.wikimedia.org/wiki/Talk:Cascadia_Wikimedians

 Pine
 On Aug 18, 2014 11:39 AM, Terry Chay tc...@wikimedia.org wrote:

  Hello everyone,
 
  It’s with great pleasure that I’m announcing that Arlo Breault joined the
  Wikimedia Foundation as a Features Engineer. Note the past tense. :-D
 
  Before joining us, Arlo worked as an independent open-source software
  developer. He’s held various contracts at the Tor Project[1],
  DuckDuckGo[2], Storify[3], and Rights & Democracy[4], amongst others,
 where
  he’s worked on everything from novel censorship circumvention systems to
  navigable visualizations of the web graph.
 
  I first found out about him in April of 2012, but we only managed to find
  a fit a year ago in July of 2013 as an international contractor. His
 first
  official day as a member of our staff was July 7, 2014. I tell myself
 that
  announcing Arlo and Marc together is the reason I’ve been tardy on these
  announcements. :-D Along with Marcoil (previous e-mail), the two of them
  work with Subbu Sastry and C. Scott Ananian to form the Parsoid team,
 which
  provides the back-end voodoo that turns your VisualEditing into wikitext
  and back again.[5]
 
  Arlo studied physics and mathematics at McGill in Montréal, and now lives
  in Victoria, BC. He has been spending a lot of his free time lately
 looking
  at cryptographic protocols, contributing OTR.js[6] to Cryptocat[7], a
  privacy preserving chat application. He’s otherwise typically Canadian,
  enjoying his maple syrup, hockey, and reading by the warm glow of a fire.
 
  Please join me in a belated welcome of Arlo to the Wikimedia Foundation.
  :-)
 
  Take care,
  Terry
 
  P.S. In keeping with Jared’s demand of a picture to accompany every new
  hire announcement, here is one:
  https://avatars0.githubusercontent.com/u/123708?v=2&s=400
 
  [1] https://www.torproject.org/
  [2] https://duckduckgo.com/
  [3] https://storify.com/
  [4]
 
 https://en.wikipedia.org/wiki/International_Centre_for_Human_Rights_and_Democratic_Development
  [5]
 
 https://blog.wikimedia.org/2013/03/04/parsoid-how-wikipedia-catches-up-with-the-web/
  [6] https://github.com/arlolra/otr
  [7] https://crypto.cat/
 
  terry chay  최태리
  Director of Features Engineering
  Wikimedia Foundation
  “Imagine a world in which every single human being can freely share in
 the
  sum of all knowledge. That's our commitment.”
 
  p: +1 (415) 839-6885 x6832
  m: +1 (408) 480-8902
  e: tc...@wikimedia.org
  i: http://terrychay.com/
  w: http://meta.wikimedia.org/wiki/User:Tychay
  aim: terrychay
 

Re: [Wikitech-l] [Wikimedia-l] Options for the German Wikipedia

2014-08-13 Thread Arlo Breault
On Monday, August 11, 2014 at 5:18 AM, svetlana wrote:
 On Mon, 11 Aug 2014, at 21:42, Chris Keating wrote:

   I think the most helpful thing would be to not attempt to start wars, and
   particularly not on behalf of anyone or against individuals. We are all on
   the same side here: trying to make the projects (and the project
   interfaces, as a part of that) better. That includes, for instance, trying
   out a new way of viewing photographs.

   I assume of course and as always that you send your message from a place 
   of
   also wanting the projects to be better and more usable. But it is hard to
   see how anything you suggest above gets us there.

   
   
   
  I agree with everything Phoebe's said.
  
  
  
 That includes, for instance, trying out a new way of viewing photographs.
 do you guys try out on the whole userbase?
 that's not how people try things
 it's not what actually happened either
  
 maybe say something more like hi people, in the background we are writing a 
 lot of wonderful code which will be used for refreshing the entire website in 
 the long term
  
 we're especially looking at how we fail to match project mission - we're 
 people, we are making mistakes!
  
 we're adding edit interface to media viewer ASAP and let everything else 
 burn until we do that
  
 etc etc
  
 and don't shy out, you ARE empowering the community already ;)
  
 including jquery into list of what gadgets can use is already a huge plus, 
 but i barely know any gadgets which use it
  
 https://en.wikipedia.org/wiki/User:Jackmcbarn/editProtectedHelper uses 
 parsoidObj from i think parsoid itself
 this potential was never exposed to developers, not to mention end users
 this software is very scriptable and flexible
  
  

Sorry, are you referring to the potential of that gadget, or Parsoid?

Parsoid exposes a public API at,
http://parsoid-lb.eqiad.wikimedia.org/

which is documented at,
https://www.mediawiki.org/wiki/Parsoid/API

and the output is specified at,
https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec

A list of known users is available at,
https://www.mediawiki.org/wiki/Parsoid/Users

and more are actively encouraged.

If there’re any misconceptions, let’s clear them up.

  
  
 the world picture is ugly and awkward and the superprotected scandal is 
 special as 1 staff didn't even know about this decision. need better 
 documenting. delays & mistakes.
  
 what i get from working with people is that one needs to make small steps, 
 carefully, and take notes; otherwise big steps may be taken in wrong direction
 and document things, go screaming and kicking, I did it! for every step made
 this way people know what is going on
  
 please keep working on documenting what on earth you're doing exactly, in 
 public
 it should be the base of the entire team
 are you doing planning in your head? design? ;) definitely not
 put it onto a public wiki, collaborate out in the open
  
 svetlana
  

Re: [Wikitech-l] DB deadlock on Mediawiki 1.23.2 with VisualEditor and Parsoid

2014-08-13 Thread Arlo Breault
Is this happening for all pages on the wiki,
or just for this “TestPage”?



On Monday, August 11, 2014 at 4:48 PM, Bjoern Kahl wrote:

  
 Dear Scott
 dear All,
  
 On 11.08.14 at 11:54, C. Scott Ananian wrote:
   What could cause this behavior and how should I configure my system to
   prevent the deadlocks? If this is a bug in either MediaWiki or the
   VisualEditor or Parsoid, how to further investigate and fix it?
   Wikitech-l mailing list
   Wikitech-l@lists.wikimedia.org (mailto:Wikitech-l@lists.wikimedia.org)
   https://lists.wikimedia.org/mailman/listinfo/wikitech-l

   
   
  Holding a database lock while a Parsoid query is made seems like a really
  bad idea. That seems like it could be a bug in MediaWiki core. However,
  I'm running an almost identical setup on my machine (Debian/unstable rather
  than wheezy) as are most Parsoid developers (Ubuntu instead of Debian) and
  I've never seen this.
   
  
  
 Honestly, I was surprised running into this problem and I am still not
 sure where the culprit is. MediaWiki is used at so many places, it is
 hard to believe in a locking bug, although possible.
  
  
  Perhaps I might suggest looking closely at the code which takes database
  locks, and why it is doing so? Perhaps more details on your db
  configuration would also be helpful, if you're not using Debian and
  MediaWiki defaults.
   
  
  
 Here come more details:
  
 - Everything runs on a single server, single core.
  
 - System runs in 32 bit mode (although CPU is capable of 64 bit mode)
  
 - mySQL table type is InnoDB
  
 - no unusual mysql server options are in effect, as far as I remember
  
 - PHP is not using memcache or any other caching module
  
 - the Wiki is mostly idle, approx 30 users and highest ever concurrent
 login count was 3 users.
  
 - All tests so far done with a copy where at most one user was active
 at anytime (myself)
  
 - my LocalSettings.php might be a bit non-standard. Here some
 settings that may or may not be relevant (in order of appearance
 in LocalSettings.php):
  
 + $wgDBprefix = "wiki_";
 + $wgDBTableOptions = "ENGINE=InnoDB, DEFAULT CHARSET=binary";
 + $wgDBmysql5 = false;
 + $wgMainCacheType = CACHE_NONE;
 + $wgMemCachedServers = array();
 + $wgUseInstantCommons = false;
 + $wgDefaultSkin = "vector";
 + $wgResourceLoaderMaxQueryLength = -1;
 + $wgGroupPermissions['*']['createaccount'] = false;
 + $wgGroupPermissions['*']['edit'] = false;
 + $wgGroupPermissions['*']['read'] = false;
 + require_once ( 'extensions/BibTex/bibtex.php' );
 + require_once ( 'extensions/MathJax/MathJax.php' );
 + $wgParserCacheType = CACHE_NONE;
 + $wgTexvc = '/extensions/Math/math/texvc';
 + $wgUseTeX = true;
 + require_once('extensions/WikiEditor/WikiEditor.php');
 + require_once "$IP/extensions/Parsoid/Parsoid.php";
 + require_once "$IP/extensions/VisualEditor/VisualEditor.php";
 + $wgDefaultUserOptions['visualeditor-enable'] = 1;
 + $wgVisualEditorParsoidURL = 'http://my.server.name:8000';
 + $wgVisualEditorParsoidPrefix = 'testwiki';
 + $wgSessionsInObjectCache = true;
 + $wgVisualEditorParsoidForwardCookies = true;
 + require_once('extensions/UserMerge/UserMerge.php');
 + require_once("$IP/extensions/LastUserLogin/LastUserLogin.php");
  
 - Also in LocalSettings.php, I have enabled debugging using:
  
 + $wgDebugLogFile = "/some/path/to/mediawiki-debug-{$wgDBname}.log";
 + $wgShowSQLErrors = true;
 + $wgShowDBErrorBacktrace = true;
 + $wgDebugTimestamps = true;
 + $wgDebugDumpSql = true;
  
 Are there other / better options to get a comprehensive trace of
 what happens when?
  
  
 - - - - Please find below an excerpt from the debug log - - - -
  
 Note:
 I tweaked wfDebugTimer() to have absolute timestamps and to have
 client address/port in order to relate debug log lines to tcpdump
 logs from capturing the on-wire conversation between MediaWiki and
 Parsoid service.
  
 Sorry for the overlong lines.
  
 If I should upload the log somewhere, please tell me where.
  
  
 (1) client access to API for visual editor
  
 client.ip:55246 17:49:15.640 0.1347 1.5M Start request GET
 /wiki/api.php?format=json&action=visualeditor&paction=parse&page=TestPage
 HTTP HEADERS:
 HOST: my.server.name
 X-REQUESTED-WITH: XMLHttpRequest
 USER-AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8)
 AppleWebKit/534.59.10 (KHTML, like Gecko) Version/5.1.9 Safari/534.59.10
 ACCEPT: application/json, text/javascript, */*; q=0.01
 REFERER: http://my.server.name/wiki/index.php/TestPage?veaction=edit
 ACCEPT-LANGUAGE: de-de
 ACCEPT-ENCODING: gzip, deflate
 COOKIE: wikiEditor-0-toolbar-section=advanced;
 test_wiki__session=gt5evdjgg9ebdrkcg4i6c6gqc0;
 test_wiki_UserName=TestUser; test_wiki_UserID=18;
 PHPSESSID=ts0j9nug4oa0o2c4fka2ok5ei3
 CONNECTION: keep-alive
 client.ip:55246 17:49:15.662 0.1566 1.5M [caches] main:
 EmptyBagOStuff, message: SqlBagOStuff, parser: EmptyBagOStuff
 client.ip:55246 17:49:15.730 0.2252 2.2M Connected to 

Re: [Wikitech-l] Finding all elements with an attribute (Parsoid?..)

2014-03-18 Thread Arlo Breault
Maybe you're looking for,

document.querySelectorAll('[lang]')

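From there, tallying usage for the kind of analysis you describe could look
like this sketch:

```
// Count how often each lang value appears in the rendered page.
var counts = {};
var nodes = document.querySelectorAll('[lang]');
Array.prototype.forEach.call(nodes, function (el) {
    var lang = el.getAttribute('lang');
    counts[lang] = (counts[lang] || 0) + 1;
});
console.log(counts);
```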

On Sun, Mar 16, 2014 at 1:35 PM, Amir E. Aharoni 
amir.ahar...@mail.huji.ac.il wrote:

 Hello,

 Is there an easy known way to find all HTML elements with an attribute that
 appear in the text of a given wiki after it's parsed?

 Here's an example of something that I need: find all elements that have the
 HTML lang attribute, with any value. This would be useful for me for
 collecting information about the multilingualism of Wikipedia - which
 foreign languages do we incorporate in pages, how often we do it, for which
 of them we may have various fonts problems, etc. This, again, must be
 checked after the page is parsed - this attribute is very often inserted by
 templates.

 Of course, this would rely on the editors actually using this attribute,
 but this is fairly common, at least in the English Wikipedia. (Among other
 things we could compare its usage between projects.)

 I could do this by analyzing a dump, but I've got a hunch that something
 like this was already done with the research that was done for Parsoid. Does
 anybody know?

 Thanks!

 --
 Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
 http://aharoni.wordpress.com
 ‪“We're living in pieces,
 I want to live in peace.” – T. Moore‬