Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread John at Darkstar
You don't have to inject javascript to do user tracking. It is
possible with any kind of raw HTML that leads to the inclusion of
external elements, including style definitions for ordinary markup.
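
(A minimal illustration of the point, as a hypothetical raw-HTML site
message emitted from PHP; the tracker host and page name are made up:)

    <?php
    // Hypothetical: a raw-HTML message that leaks reader data with no
    // JS at all -- the browser fetches the external image on every view.
    $page = 'Main_Page';
    echo '<img src="http://tracker.example.com/bug.gif?page=' .
        urlencode( $page ) . '" width="1" height="1" alt="" />';
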
John

Daniel Kinzler wrote:
> David Gerard wrote:
>> 2009/6/4 Gregory Maxwell :
>>
>>> Restrict site-wide JS and raw HTML injection to a smaller subset of
>>> users who have been specifically schooled in these issues.
>>
>> Is it feasible to allow admins to use raw HTML as appropriate but not
>> raw JS? Being able to fix MediaWiki: space messages with raw HTML is
>> way too useful on the occasions where it's useful.
>>
> 
> Possible yes, sensible no. Because if you can edit raw HTML, you can inject
> javascript.
> 
> -- daniel
> 



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Stephen Bain
On Fri, Jun 5, 2009 at 1:00 AM, Gregory Maxwell wrote:
>
> What exactly are people looking for that isn't available from
> stats.grok.se that isn't a privacy concern?

A good question.

Related questions are:
1) What can't be built into stats.grok.se (or other services built on
the same data)?
2) Is there anything that really needs to be done by JavaScript?

Don't forget all of the currently available traffic analysis here:

http://stats.wikimedia.org/EN/VisitorsSampledLogRequests.htm

-- 
Stephen Bain
stephen.b...@gmail.com



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread jidanni
Actually you can take a lesson from Google and every once in a while
prefix all links, e.g.,
"http://en.wikipedia.org/url?http://en.wikipedia.org/wiki/Norflblarg"
(some kind of recording redirector). How do I know Google does this
every so often in their search results pages? I use
DontGet{:*://*.google.*/url?*} in my wwwoffle.conf file, so the alarm
bells ring, whereas the average user would never notice the difference.
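
(For illustration, a hypothetical recording redirector can be nothing
more than this; the target parameter and the log call are stand-ins:)

    <?php
    // Hypothetical /url endpoint: record the click, then bounce the
    // reader on to the real target with an HTTP redirect.
    $target = isset( $_GET['target'] ) ? $_GET['target'] : 'http://en.wikipedia.org/';
    error_log( 'click: ' . $target );             // stand-in for real logging
    header( 'Location: ' . $target, true, 302 );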



[Wikitech-l] firefogg local encode & new-upload branch update.

2009-06-04 Thread Michael Dale
As you may know I have been working on Firefogg integration with
MediaWiki. As you may also know the mwEmbed library is being designed to
support embedding of these interfaces in arbitrary external contexts.  I
wanted to quickly highlight a useful stand-alone usage example of the
library:

http://www.firefogg.org/make/advanced.html

This "Make Ogg" link will be something you can send to a person so they 
can encode source footage to a local ogg video file with the latest and 
greatest ogg encoders (presently the thusnelda theora encoder  & vorbis 
audio). Updates to thusnelda and other free codecs will be pushed out 
via firefogg updates.

For Commons / Wikimedia usage we will directly integrate Firefogg
(using that same codebase). You can see an example of how that works on
the 'new-upload' branch here:
http://sandbox.kaltura.com/testwiki/index.php/Special:Upload ...
hopefully we will start putting some of this on testing.wikipedia.org
~soonish?~

The new-upload branch feature set is quite extensive, including the
script-loader, the jQuery javascript refactoring, the new upload API,
the new mv_embed video player, the add-media wizard, etc. Any feedback
and specific bug reports will be super helpful in gearing up for
merging this 'new-upload' branch.

For an overview see:
http://www.mediawiki.org/wiki/Media_Projects_Overview

peace,
--michael



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Brian :

> That's why WMF now has a usability lab.


Yep. They'd dive on this stuff with great glee if we can implement it
without breaking privacy or melting servers.


- d.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Brian
That's why WMF now has a usability lab.

On Thu, Jun 4, 2009 at 12:34 PM, David Gerard  wrote:

> 2009/6/4 Brian :
>
> > How does installing 3rd party analytics software help the WMF accomplish
> its
> > goals?
>
>
> Detailed analysis of how users actually use the site would be vastly
> useful in improving the sites' content and usability.
>
>
> - d.
>


Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Andrew Garrett

On 04/06/2009, at 8:03 PM, Robert Rohde wrote:
> I would like to note for the record that Brion explicitly "endorsed"
> the padleft hack to the degree that he re-enabled it after Werdna had
> removed it. [1]
>
> Maybe he'd change his mind after looking at how the "string
> manipulation templates" are actually getting used (now in >20,000
> enwiki pages and counting), but for the moment he seems to have
> supported allowing some form of hacked together string manipulation
> system into Mediawiki.  To that end it makes more sense to have a real
> string implementation rather than the ridiculous templates we have
> now.

I wouldn't read that into it. I think it's better characterised as  
reverting attempts to create an "arms race" over the hacks.

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us






Re: [Wikitech-l] Internal links and diacritics

2009-06-04 Thread Strainu
On Thu, Jun 4, 2009 at 10:53 PM, Ahmad Sherif  wrote:
>>
>> You have to use the MediaWiki:Linktrail page, for example:
>> http://hu.wikipedia.org/wiki/MediaWiki:Linktrail (or see the same page on
>> fr.wiki).
>
>
> AFAIK, it has to be set in the language file through the $linkTrail
> variable, because it looks like MediaWiki:Linktrail is no longer used.
>
> On Thu, Jun 4, 2009 at 10:38 PM, Tar Dániel  wrote:
>
>> You have to use the MediaWiki:Linktrail page, for example:
>> http://hu.wikipedia.org/wiki/MediaWiki:Linktrail (or see the same page on
>> fr.wiki).
>>
>> D.

Yep, I started from there and got to
http://meta.wikimedia.org/wiki/MediaWiki_talk:Linktrail and it all
suddenly became clear :)

Thank you all for your responses.

Strainu



Re: [Wikitech-l] Internal links and diacritics

2009-06-04 Thread Roan Kattouw
2009/6/4 Strainu :
> Hi,
>
> I'm trying to format a link like this: [[musulman]]ă. On ro.wp, this
> is equivalent to [[musulman|musulman]]ă (the special letter is not
> included in the wiki link). While going through
> http://www.mediawiki.org/wiki/Markup_spec I saw that:
>
>  <internal-link>  ::= <start-link> <page-name> [ "#" <section-id> ]
>                       [ <pipe> [<link-description>] ] <end-link> [<link-trail>]
>  <link-trail>     ::= <letter> [<link-trail>]
>  <letter>         ::= <ucase-letter> | <lcase-letter>
>  <ucase-letter>   ::= "A" | "B" | ... | "Y" | "Z"
>  <lcase-letter>   ::= "a" | "b" | ... | "y" | "z"
>
>
> This tells me that only ASCII letters are used for this type of
> linking. However, on fr.wp I can write [[Ren]]é and this is equivalent
> to [[Ren|René]].
>
> How was this done? Is it something that can be set from a page, or
> does some PHP need to be changed?
>
The set of characters allowed in the so-called linktrail depends on
the language used, and is set in the individual LanguageXx.php files.
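
(For example, such a definition looks roughly like this; the exact
character class below is illustrative, not copied from any shipped
file:)

    <?php
    // Per-language link trail: the first capture is the run of letters
    // that blends into the link text, the second is the rest of the line.
    $linkTrail = '/^([a-zăâîșț]+)(.*)$/sDu';
    preg_match( $linkTrail, 'ă latin text', $m );
    // $m[1] === 'ă' -- rendered as part of the preceding [[...]] link.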

Roan Kattouw (Catrope)



Re: [Wikitech-l] Internal links and diacritics

2009-06-04 Thread Aryeh Gregor
2009/6/4 Strainu :
> While going through
> http://www.mediawiki.org/wiki/Markup_spec I saw that:
>
>  <internal-link>  ::= <start-link> <page-name> [ "#" <section-id> ]
>                       [ <pipe> [<link-description>] ] <end-link> [<link-trail>]
>  <link-trail>     ::= <letter> [<link-trail>]
>  <letter>         ::= <ucase-letter> | <lcase-letter>
>  <ucase-letter>   ::= "A" | "B" | ... | "Y" | "Z"
>  <lcase-letter>   ::= "a" | "b" | ... | "y" | "z"
>
>
> This tells me that only ASCII letters are used for this type of
> linking.

It's wrong.  Don't trust that page too much.  It was written after the
fact to try to document the parser, not something the parser was
designed to follow.  It's almost certainly wrong in a lot of corner
cases.  (Like non-English languages, apparently.)

On Thu, Jun 4, 2009 at 3:53 PM, Ahmad Sherif wrote:
> AFAIK, it has to be set in the language file through the $linkTrail
> variable, because it looks like MediaWiki:Linktrail is no longer used.

Correct.


Re: [Wikitech-l] Internal links and diacritics

2009-06-04 Thread Ahmad Sherif
>
> You have to use the MediaWiki:Linktrail page, for example:
> http://hu.wikipedia.org/wiki/MediaWiki:Linktrail (or see the same page on
> fr.wiki).


AFAIK, it has to be set in the language file through the $linkTrail
variable, because it looks like MediaWiki:Linktrail is no longer used.

On Thu, Jun 4, 2009 at 10:38 PM, Tar Dániel  wrote:

> You have to use the MediaWiki:Linktrail page, for example:
> http://hu.wikipedia.org/wiki/MediaWiki:Linktrail (or see the same page on
> fr.wiki).
>
> D.


Re: [Wikitech-l] Internal links and diacritics

2009-06-04 Thread Tar Dániel
You have to use the MediaWiki:Linktrail page, for example:
http://hu.wikipedia.org/wiki/MediaWiki:Linktrail (or see the same page on
fr.wiki).

D.


[Wikitech-l] Internal links and diacritics

2009-06-04 Thread Strainu
Hi,

I'm trying to format a link like this: [[musulman]]ă. On ro.wp, this
is equivalent to [[musulman|musulman]]ă (the special letter is not
included in the wiki link). While going through
http://www.mediawiki.org/wiki/Markup_spec I saw that:

<internal-link>  ::= <start-link> <page-name> [ "#" <section-id> ]
                     [ <pipe> [<link-description>] ] <end-link> [<link-trail>]
<link-trail>     ::= <letter> [<link-trail>]
<letter>         ::= <ucase-letter> | <lcase-letter>
<ucase-letter>   ::= "A" | "B" | ... | "Y" | "Z"
<lcase-letter>   ::= "a" | "b" | ... | "y" | "z"


This tells me that only ASCII letters are used for this type of
linking. However, on fr.wp I can write [[Ren]]é and this is equivalent
to [[Ren|René]].

How was this done? Is it something that can be set from a page, or
does some PHP need to be changed?

Thanks,
   Strainu



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Robert Rohde
On Thu, Jun 4, 2009 at 11:52 AM, Aryeh Gregor wrote:
> Note, though, that there are some that are already possible to some
> extent.  You can use the core padright/padleft functions to emulate a
> couple of the added functions.  E.g.:
>
> http://en.wikipedia.org/w/index.php?title=Template:Str_len&action=edit


I would like to note for the record that Brion explicitly "endorsed"
the padleft hack to the degree that he re-enabled it after Werdna had
removed it. [1]

Maybe he'd change his mind after looking at how the "string
manipulation templates" are actually getting used (now in >20,000
enwiki pages and counting), but for the moment he seems to have
supported allowing some form of hacked together string manipulation
system into Mediawiki.  To that end it makes more sense to have a real
string implementation rather than the ridiculous templates we have
now.

-Robert Rohde

[1] http://svn.wikimedia.org/viewvc/mediawiki?view=rev&revision=47411



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Robert Rohde
On Thu, Jun 4, 2009 at 11:29 AM, Brian  wrote:
> I was privy to a #mediawiki conversation between Brion and Tim where Tim
> pointed out that at least one person plans to implement a Natural Language
> Processing parser for English using StringFunctions just as soon as they
> are enabled.
>
> It's pretty obvious that you can implement all sorts of crazy algorithms
> using StringFunctions. They need to be limited so that is not possible.

If you are referring to the conversation I think you are, then my
impression was Tim was speaking hypothetically about the issue rather
than knowing someone that had this specific intent.

I'm fairly dubious about anyone actually trying natural language
processing to any serious degree.  Real natural language processing
needs huge lookup tables to identify parts of speech, relationships,
etc.  Technically possible, I suppose, but not easy to do.

I'm even more dubious that full-fledged natural language processing --
in templates -- would find significant uses.  It is more efficient and
more practical to view templates as simple formatting macros rather
than as a system for real natural language interaction.  There are
very useful things that can be done with simple string algorithms,
such as detecting the "(bar)" when given a title like "Foo (bar)", but
I wouldn't expect anyone to be answering queries with them or anything
like that.

When providing tools to content creators, flexibility is generally a
positive design feature.  We shouldn't go overboard with imposing
limits in advance of actual problems.

The current implementation is, however, artificially limited to strings
of 1000 characters or less, which does prevent huge manipulations.

-Robert Rohde



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Aryeh Gregor
On Thu, Jun 4, 2009 at 2:29 PM, Brian wrote:
> I was privy to a #mediawiki conversation between Brion and Tim where Tim
> pointed out that at least one person plans to implement a Natural Language
> Processing parser for English using StringFunctions just as soon as they
> are enabled.
>
> It's pretty obvious that you can implement all sorts of crazy algorithms
> using StringFunctions. They need to be limited so that is not possible.

Note, though, that there are some that are already possible to some
extent.  You can use the core padright/padleft functions to emulate a
couple of the added functions.  E.g.:

http://en.wikipedia.org/w/index.php?title=Template:Str_len&action=edit
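
(The trick works because {{padleft:}} pads with its pad string repeated
and truncated to the requested width -- essentially PHP's str_pad
semantics, sketched below; the actual wikitext of Template:Str_len is
more involved:)

    <?php
    // Padding an empty string leaks the pad string's own characters,
    // which is what the Str_len emulation exploits.
    var_dump( str_pad( '', 5, 'abc', STR_PAD_LEFT ) );  // string(5) "abcab"
    var_dump( str_pad( '', 2, 'abc', STR_PAD_LEFT ) );  // string(2) "ab"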

The most template-heavy pages already tend to run close to the
template limits, until they're cut down by users when they fail.  It's
not clear to me that allowing more functions would actually increase
overall load or template complexity significantly.  It might decrease
it by allowing simpler and more efficient implementations of things
that currently need to be worked around.  It can't really increase it
too much, theoretically -- that's what the template limits are for.

Werdna points out that Tim did say this morning in #mediawiki that
he'd probably revert the change.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Chad
On Thu, Jun 4, 2009 at 2:32 PM, David Gerard  wrote:
> 2009/6/4 Gregory Maxwell :
>
>> I think the biggest obstacle to reducing access is that far more
>> MediaWiki messages are uncooked (raw) than necessary. Were it not for
>> this, I expect this access would have been curtailed a long time ago.
>
>
> I think you've hit the actual problem there. Someone with too much
> time on their hands who could go through all of the MediaWiki: space
> to see what really needs to be HTML rather than wikitext?
>
>
> - d.
>

See bug 212[1], which is (sort of) a tracker for the wikitext-ification
of the messages.

-Chad

[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=212



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Brian :

> How does installing 3rd party analytics software help the WMF accomplish its
> goals?


Detailed analysis of how users actually use the site would be vastly
useful in improving the sites' content and usability.


- d.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Gregory Maxwell :

> I think the biggest obstacle to reducing access is that far more
> MediaWiki messages are uncooked (raw) than necessary. Were it not for
> this, I expect this access would have been curtailed a long time ago.


I think you've hit the actual problem there. Someone with too much
time on their hands who could go through all of the MediaWiki: space
to see what really needs to be HTML rather than wikitext?


- d.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Aryeh Gregor
On Thu, Jun 4, 2009 at 11:56 AM, Neil Harris wrote:
> However; writing a javascript sanitizer that restricted the user to a
> "safe" subset of the language, by first parsing and then resynthesizing
> the code using formal methods for validation, in a way similar to the
> current solution for TeX, would be an interesting project!

Interesting, but probably not very useful.  If we restricted
JavaScript the way we restricted TeX, we'd have to ban function
definitions, loops, conditionals, and most function calls.  I suspect
you'd have to make it pretty much unusable to make output of specific
strings impossible.

On Thu, Jun 4, 2009 at 12:45 PM, Gregory Maxwell wrote:
> Regarding HTML sanitation: Raw HTML alone without JS is enough to
> violate users privacy: Just add a hidden image tag to a remote site.
> Yes you could sanitize out various bad things, but then thats not raw
> HTML anymore, is it?

It might be good enough for the purposes at hand, though.  What are
the use-cases for wanting raw HTML in messages, instead of wikitext or
plaintext?



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Andrew Garrett :

> When did we start treating our administrators as potentially malicious
> attackers? Any administrator could, in theory, add a cookie-stealing
> script to my user JS, steal my account, and grant themselves any
> rights they please.


That's why I started this thread talking about things being done right
now by well-meaning admins :-)


- d.



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Brian
I was privy to a #mediawiki conversation between Brion and Tim where
Tim pointed out that at least one person plans to implement a Natural
Language Processing parser for English using StringFunctions just as
soon as they are enabled.

It's pretty obvious that you can implement all sorts of crazy
algorithms using StringFunctions. They need to be limited so that is
not possible.

On Thu, Jun 4, 2009 at 10:19 AM, Robert Rohde  wrote:

> On Thu, Jun 4, 2009 at 9:05 AM, Andrew Garrett wrote:
> >
> > On 04/06/2009, at 3:46 PM, H. Langos wrote:
> >
> >> On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
> >>> Seems like (at least) the API of #pos in ParserFunctions is
> >>> different from the one in StringFunctions.
> >>>
> >>> {{#pos: haystack|needle|offset}}
> >>>
> >>> While the StringFunctions #pos in MediaWiki 1.14 returned an
> >>> empty string when the needle was not found, the ParserFunctions
> >>> implementation of #pos in svn now returns -1.
> >>>
> >>
> >> I forgot to ask THE question. Is it a bug or is there some good reason
> >> to break backward compatibility?
> >>
> >> And no, programming language cosmetics is not a good reason. :-)
> >>
> >> If something has the same interface, it should have the same
> >> behaviour.
> >> If the old semantics was too awful to bear, the new one should have
> >> been
> >> called #strpos or #fpos (for forward-#pos. #rpos always had the
> >> "-1 return on no-found" behaviour).
> >
> > This should be left as a comment on the relevant revision in
> > CodeReview. Note that it's likely irrelevant anyway, as, in all
> > likelihood, the merge of String and Parser Functions will be reverted.
>
> Two devs, who shall remain nameless unless they choose to take credit
> for it, explicitly encouraged the merge.  Personally, I've always
> thought it made more sense to keep these as separate extensions but I
> went along with what they encouraged me to do.
>
> Regardless of whether it is one extension or two, I do strongly feel
> that once a technically acceptable implementation of string functions
> exists then it should be enabled on WMF sites.  (I agree though that
> the previous StringFunctions was rightly excluded due to
> implementation problems.)
>
> -Robert Rohde
>


Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Finne Boonen :
> On Thu, Jun 4, 2009 at 17:00, Gregory Maxwell  wrote:

>> What exactly are people looking for that isn't available from
>> stats.grok.se that isn't a privacy concern?
>> I had assumed that people kept installing these bugs because they
>> wanted source network break downs per-article and other clear privacy
>> violations.

> On top of views/page
> I'd be interested in keywords used, entry&exit points, path analysis
> when people are editing (do they save/leave/try to find help/...)
> #edit starts, #submitted edits that don't get saved.


Path analysis is a big one. All that other stuff, if it won't violate
privacy, would be fantastically useful to researchers, internal and
external, in ways we won't have even thought of yet, and help us
considerably to improve the projects.

(This would have to be given considerable thought from a
security/hacker mindset - e.g. even with IPs stripped, listing user
pages and user page edits would likely give away an identity. Talk
pages may do the same. Those are just off the top of my head, I'm sure
someone has already made a list of what they could work out even with
IPs anonymised or even stripped.)


- d.



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Aryeh Gregor
On Thu, Jun 4, 2009 at 12:05 PM, Andrew Garrett wrote:
> Note that it's likely irrelevant anyway, as, in all
> likelihood, the merge of String and Parser Functions will be reverted.

Have Tim or Brion said this?
 is the only
clear statement I've seen by either of them that I can recall.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Brian
How does installing 3rd party analytics software help the WMF accomplish its
goals?

On Thu, Jun 4, 2009 at 8:31 AM, Daniel Kinzler  wrote:

> David Gerard wrote:
> > Keeping well-meaning admins from putting Google web bugs in the
> > JavaScript is a game of whack-a-mole.
> >
> > Are there any technical workarounds feasible? If not blocking the
> > loading of external sites entirely (I understand hu:wp uses a web bug
> > that isn't Google), perhaps at least listing the sites somewhere
> > centrally viewable?
>
> Perhaps the solution would be to simply set up our own JS based usage
> tracker?
> There are a few options available
> , and for
> starters,
> the backend could run on the toolserver.
>
> Note that anything processing IP addresses will need special approval on
> the TS.
>
> -- daniel
>


Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Gregory Maxwell
On Thu, Jun 4, 2009 at 12:04 PM, Andrew Garrett  wrote:
>>> Is it feasible to allow admins to use raw HTML as appropriate but not
>>> raw JS? Being able to fix MediaWiki: space messages with raw HTML is
>>> way too useful on the occasions where it's useful.
>>>
>>
>> Possible yes, sensible no. Because if you can edit raw HTML, you can
>> inject javascript.
>
>
> When did we start treating our administrators as potentially malicious
> attackers? Any administrator could, in theory, add a cookie-stealing
> script to my user JS, steal my account, and grant themselves any
> rights they please.
>
> We trust our administrators. If we don't, we should move the
> editinterface right further up the chain.

90% of the possibly malicious things administrators could do are
easily undone. Most of the remainder is limited in scope, impacting
few users at a time.  Site-wide JS can cause irreparable harm to
users' privacy, and do it to hundreds of thousands in an instant.

Outside of raw HTML and JS, no other admin feature grants the ability
to completely disable third-party sites.

And forget malice: there is no reason for admins to add remotely loaded
external resources. Any such addition is going to violate the privacy
policy.  Yet it keeps happening.

You don't have to be malicious to be completely unaware of the privacy
implications or of the potential DoS risks. You don't have to be
malicious to apply the same sloppy editing practices used on easily
and instantly revertible articles to site messages and JS (caching
ensures that many JS and message mistakes aren't completely undone for
many hours). Though we shouldn't preclude the possibility of
occasional malice: it isn't as though we haven't had admins choose
easily guessable passwords in the past, or flip their lids and attempt
to cause problems.

In places where the harm is confined and can be undone, softer
security measures make sense. As the destructive and disruptive power
increases, the appropriate level of security also increases.

We impose stiffer regulation for access permissions like checkuser…
even though an admin's ability to add web bugs is a significantly more
powerful privacy-invasion tool than checkuser. (Checkusers can't see
typical reader activities!)

Raw HTML and JS have drastically different implications than most
other 'admin' functions. Accordingly, the optimal security behaviour
is different.  When there are few enough admins the problems are
infrequent enough to ignore, but as things grow...

The number of uses of site-wide JS and raw HTML is fairly limited, as
is the number of users with the technical skills required to use them
correctly.  Arguably, every instance of user manipulation of raw HTML
and site-wide JS is a deficiency in MediaWiki.


Regarding HTML sanitation: raw HTML alone, without JS, is enough to
violate users' privacy: just add a hidden image tag to a remote site.
Yes, you could sanitize out various bad things, but then that's not
raw HTML anymore, is it?

I think the biggest obstacle to reducing access is that far more
MediaWiki messages are uncooked (raw) than necessary. Were it not for
this, I expect this access would have been curtailed a long time ago.


Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread H. Langos
On Thu, Jun 04, 2009 at 09:11:31AM -0700, Robert Rohde wrote:
> On Thu, Jun 4, 2009 at 6:55 AM, H. Langos  wrote:
> > Seems like (at least) the API of #pos in ParserFunctions is
> > different from the one in StringFunctions.
> >
> > {{#pos: haystack|needle|offset}}
> >
> > While the StringFunctions #pos in MediaWiki 1.14 returned an
> > empty string when the needle was not found, the ParserFunctions
> > implementation of #pos in svn now returns -1.
> 
> 
> Prior to the merge 100% of the StringFunction function calls were
> reimplemented, principally for performance and security reasons.
> 
> The short but uninspired answer to your question is that in doing that
> I didn't notice that #pos and #rpos had different default behavior.
> Given the way that #if works, returning empty string is a reasonable
> response to a string-not-found condition, and I am happy to change
> that back.  I'll also recheck to make sure there aren't any other
> unexpected behavioral changes.

That would be very much appreciated.

> Though they don't have to have the same behavior, I'd be inclined to
> argue that #pos and #rpos really ought to have the same default
> behavior on usability grounds, i.e. either both giving -1 or both
> giving empty string when a match is not found.  Though since that does
> create compatibility issues with existing StringFunctions users, I'll
> defer to others about whether consistency would be a good enough
> motivation in this case.

I'd argue that even though #pos and #rpos are very similar functions,
their use cases are very dissimilar. I.e., as long as there is no
(regex) match function, the #pos function is de facto its replacement.

> I should warn you though that there is an intentional behavioral
> change regarding the handling of strip markers.  The pre-existing
> StringFunctions codebase reacted to strip markers in a way that was
> inefficient, hard for the end user to predict, and in specially
> crafted cases created security issues.
> 
> The following example is illustrative of the change.
> 
> Consider the string "ABCjklDEFmnoGHI"
> 
> In the new implementation this is treated internally as "ABCDEFGHI" by
> the string routines.  Hence its length is 9 and its first five
> characters are ABCDE.
> 
> For complicated reasons the StringFunctions version says its length is
> 7 and the first "five" characters are ABCjklDEFmnoG.

That change sounds more like a bugfix than a change of "intended"
behaviour. :-)

BTW: Is there a way to find articles that use ParserFunctions, like
there is a way to locate usage of templates? That would let users find
all the places they need to pay attention to when upgrading their
MediaWiki installation.

cheers
-henrik




Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Mike.lifeguard
On Thu, 2009-06-04 at 17:04 +0100, Andrew Garrett wrote:

> When did we start treating our administrators as potentially malicious  
> attackers? Any administrator could, in theory, add a cookie-stealing  
> script to my user JS, steal my account, and grant themselves any  
> rights they please.
> 
> We trust our administrators. If we don't, we should move the  
> editinterface right further up the chain.


They are potentially malicious attackers, but we nevertheless trust them
not to do bad things. "We" in this case refers only to most of
Wikimedia, I guess, since there has been no shortage of paranoia both on
bugzilla and this list recently - a sad state of affairs to be sure.

-Mike


Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Robert Rohde
On Thu, Jun 4, 2009 at 9:05 AM, Andrew Garrett  wrote:
>
> On 04/06/2009, at 3:46 PM, H. Langos wrote:
>
>> On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
>>> Seems like (at least) the API of #pos in ParserFunctions is
>>> different from the one in StringFunctions.
>>>
>>> {{#pos: haystack|needle|offset}}
>>>
>>> While the StringFunctions #pos in MediaWiki 1.14 returned an
>>> empty string when the needle was not found, the ParserFunctions
>>> implementation of #pos in svn now returns -1.
>>>
>>
>> I forgot to ask THE question. Is it a bug or is there some good reason
>> to break backward compatibility?
>>
>> And no, programming language cosmetics is not a good reason. :-)
>>
>> If something has the same interface, it should have the same
>> behaviour.
>> If the old semantics was too awful to bear, the new one should have
>> been
>> called #strpos or #fpos (for forward-#pos. #rpos always had the
>> "-1 return on no-found" behaviour).
>
> This should be left as a comment on the relevant revision in
> CodeReview. Note that it's likely irrelevant anyway, as, in all
> likelihood, the merge of String and Parser Functions will be reverted.

Two devs, who shall remain nameless unless they choose to take credit
for it, explicitly encouraged the merge.  Personally, I've always
thought it made more sense to keep these as separate extensions but I
went along with what they encouraged me to do.

Regardless of whether it is one extension or two, I do strongly feel
that once a technically acceptable implementation of string functions
exists then it should be enabled on WMF sites.  (I agree though that
the previous StringFunctions was rightly excluded due to
implementation problems.)

-Robert Rohde



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread H. Langos
On Thu, Jun 04, 2009 at 05:05:50PM +0100, Andrew Garrett wrote:
> 
> On 04/06/2009, at 3:46 PM, H. Langos wrote:
> 
> > On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
> >> Seems like (at least) the API of #pos in ParserFunctions is
> >> different from the one in StringFunctions.
> >>
> >> {{#pos: haystack|needle|offset}}
> >>
> >> While the StringFunctions #pos in MediaWiki 1.14 returned an
> >> empty string when the needle was not found, the ParserFunctions
> >> implementation of #pos in svn now returns -1.
> >>
> >
> > I forgot to ask THE question. Is it a bug or is there some good reason
> > to break backward compatibility?
> >
> > And no, programming language cosmetics is not a good reason. :-)
> >
> > If something has the same interface, it should have the same  
> > behaviour.
> > If the old semantics was too awful to bear, the new one should have
> > been
> > called #strpos or #fpos (for forward-#pos. #rpos always had the
> > "-1 return on no-found" behaviour).
> 
> This should be left as a comment on the relevant revision in  
> CodeReview. Note that it's likely irrelevant anyway, as, in all  
> likelihood, the merge of String and Parser Functions will be reverted.

Sorry to bother you, but I am not a Wikimedia developer, so I wouldn't
know where to start looking.

Could you point me to the right place/list/article? The svn revision with 
the String and Parser Functions merge was 50997.

cheers
-henrik




Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Mike.lifeguard
Thanks, that clarifies matters for me. I wasn't aware of #1, though I
guess upon reflection that makes sense.

-Mike

On Thu, 2009-06-04 at 11:07 -0400, Gregory Maxwell wrote:

> On Thu, Jun 4, 2009 at 11:01 AM, Mike.lifeguard wrote:
> > On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:
> >
> >> Then external site loading can be blocked.
> >
> >
> > Why do we need to block loading from all external sites? If there are
> > specific & problematic ones (like google analytics) then why not block
> > those?
> 
> Because:
> 
> (1) External loading results in an uncontrolled leak of private reader
> and editor information to third parties, in contravention of the
> privacy policy as well as basic ethical operating principles.
> 
> (1a) Most external script loading will also defeat users' choice of
> SSL and leak more information about their browsing to their local
> network. It may also bypass any Wikipedia-specific anonymization
> proxies they are using to keep their reading habits private.
> 
> (2) External loading produces a runtime dependency on third party
> sites. Some other site goes down and our users experience some kind of
> loss of service.
> 
> (3) The availability of external loading makes Wikimedia a potential
> source of very significant DDOS attacks, intentional or otherwise.
> 
> That's not to say that there aren't reasons to use remote loading, but
> the potential harms mean that it should probably be a default-deny
> permit-by-exception process rather than the other way around.
> 
> 


Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Robert Rohde
On Thu, Jun 4, 2009 at 6:55 AM, H. Langos  wrote:
> Seems like (at least) the API of #pos in ParserFunctions is
> different from the one in StringFunctions.
>
> {{#pos: haystack|needle|offset}}
>
> While the StringFunctions #pos in MediaWiki 1.14 returned an
> empty string when the needle was not found, the ParserFunctions
> implementation of #pos in svn now returns -1.


Prior to the merge 100% of the StringFunction function calls were
reimplemented, principally for performance and security reasons.

The short but uninspired answer to your question is that in doing that
I didn't notice that #pos and #rpos had different default behavior.
Given the way that #if works, returning empty string is a reasonable
response to a string-not-found condition, and I am happy to change
that back.  I'll also recheck to make sure there aren't any other
unexpected behavioral changes.
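
(In PHP terms the question is just how strpos()'s false gets mapped
back into wikitext; a sketch of the two behaviours, with a hypothetical
wrapper name:)

    <?php
    // strpos() returns false when the needle is absent; the extension
    // has to map that onto some wikitext-friendly value.
    function posOf( $haystack, $needle, $offset = 0 ) {
        $pos = strpos( $haystack, $needle, $offset );
        // Old StringFunctions: '' on not-found, which {{#if:}} treats
        // as false.  The merged code returned -1 instead.
        return ( $pos === false ) ? '' : $pos;
    }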

Though they don't have to have the same behavior, I'd be inclined to
argue that #pos and #rpos really ought to have the same default
behavior on usability grounds, i.e. either both giving -1 or both
giving empty string when a match is not found.  Though since that does
create compatibility issues with existing StringFunctions users, I'll
defer to others about whether consistency would be a good enough
motivation in this case.


I should warn you though that there is an intentional behavioral
change regarding the handling of strip markers.  The pre-existing
StringFunctions codebase reacted to strip markers in a way that was
inefficient, hard for the end user to predict, and in specially
crafted cases created security issues.

The following example is illustrative of the change.

Consider the string "ABCjklDEFmnoGHI"

In the new implementation this is treated internally as "ABCDEFGHI" by
the string routines.  Hence its length is 9 and its first five
characters are ABCDE.

For complicated reasons the StringFunctions version says its length is
7 and the first "five" characters are ABCjklDEFmnoG.
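
(Roughly, in PHP -- assuming strip markers of the general shape
"\x7fUNIQ...QINU\x7f"; the exact marker format varies across MediaWiki
versions, and the helper name here is made up:)

    <?php
    // Remove strip markers before doing string work, so lengths and
    // offsets refer only to the visible text ("ABCDEFGHI" above).
    function killMarkers( $text ) {
        return preg_replace( '/\x7fUNIQ.*?QINU\x7f/s', '', $text );
    }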

-Robert Rohde



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread Andrew Garrett

On 04/06/2009, at 3:46 PM, H. Langos wrote:

> On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
>> Seems like (at least) the API of #pos in ParserFunctions is
>> different from the one in StringFunctions.
>>
>> {{#pos: haystack|needle|offset}}
>>
>> While the StringFunctions #pos in MediaWiki 1.14 returned an
>> empty string when the needle was not found, the ParserFunctions
>> implementation of #pos in svn now returns -1.
>>
>
> I forgot to ask THE question. Is it a bug or is there some good reason
> to break backward compatibility?
>
> And no, programming language cosmetics is not a good reason. :-)
>
> If something has the same interface, it should have the same  
> behaviour.
> If the old semantics was too awful to bear, the new one should have
> been
> called #strpos or #fpos (for forward-#pos. #rpos always had the
> "-1 return on no-found" behaviour).

This should be left as a comment on the relevant revision in  
CodeReview. Note that it's likely irrelevant anyway, as, in all  
likelihood, the merge of String and Parser Functions will be reverted.

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us






Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Andrew Garrett

On 04/06/2009, at 4:08 PM, Daniel Kinzler wrote:

> David Gerard wrote:
>> 2009/6/4 Gregory Maxwell :
>>
>>> Restrict site-wide JS and raw HTML injection to a smaller subset of
>>> users who have been specifically schooled in these issues.
>>
>>
>> Is it feasible to allow admins to use raw HTML as appropriate but not
>> raw JS? Being able to fix MediaWiki: space messages with raw HTML is
>> way too useful on the occasions where it's useful.
>>
>
> Possible yes, sensible no. Because if you can edit raw HTML, you can
> inject javascript.


When did we start treating our administrators as potentially malicious  
attackers? Any administrator could, in theory, add a cookie-stealing  
script to my user JS, steal my account, and grant themselves any  
rights they please.

We trust our administrators. If we don't, we should move the  
editinterface right further up the chain.

--
Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us






Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Neil Harris
Neil Harris wrote:
> Daniel Kinzler wrote:
>> David Gerard wrote:
>>> 2009/6/4 Gregory Maxwell :
>>>> Restrict site-wide JS and raw HTML injection to a smaller subset of
>>>> users who have been specifically schooled in these issues.
>>>
>>> Is it feasible to allow admins to use raw HTML as appropriate but not
>>> raw JS? Being able to fix MediaWiki: space messages with raw HTML is
>>> way too useful on the occasions where it's useful.
>>
>> Possible yes, sensible no. Because if you can edit raw HTML, you can
>> inject javascript.
>>
>> -- daniel
>
> Not if you sanitize the HTML after the fact: just cleaning out <script>
> tags.

Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Thomas Dalton
2009/6/4 Mike.lifeguard :
> On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:
>
>> Then external site loading can be blocked.
>
>
> Why do we need to block loading from all external sites? If there are
> specific & problematic ones (like google analytics) then why not block
> those?

I can't think of any time when we would need to load anything from an
external site, so why not block them completely and eliminate the
privacy concern entirely?



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Finne Boonen
On Thu, Jun 4, 2009 at 17:00, Gregory Maxwell  wrote:
> On Thu, Jun 4, 2009 at 10:53 AM, David Gerard  wrote:
>> I understand the problem with stats before was that the stats server
>> would melt under the load. Leon's old wikistats page sampled 1:1000.
>> The current stats (on dammit.lt and served up nicely on
>> http://stats.grok.se) are every hit, but I understand (Domas?) that it
>> was quite a bit of work to get the firehose of data in such a form as
>> not to melt the receiving server trying to process it.
>>
>> OK, then the problem becomes: how to set up something like
>> stats.grok.se feasibly internally for all the other data gathered from
>> a hit? (Modulo stuff that needs to be blanked per privacy policy.)
>
> What exactly are people looking for that isn't available from
> stats.grok.se that isn't a privacy concern?
>
> I had assumed that people kept installing these bugs because they
> wanted source network break downs per-article and other clear privacy
> violations.

On top of views per page, I'd be interested in keywords used, entry &
exit points, path analysis when people are editing (do they save /
leave / try to find help / ...), the number of edit starts, and the
number of submitted edits that don't get saved.

henna

-- 
"Maybe you knew early on that your track went from point A to B, but
unlike you I wasn't given a map at birth!" Alyssa, "Chasing Amy"



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Neil Harris
Daniel Kinzler wrote:
> David Gerard wrote:
>> 2009/6/4 Gregory Maxwell :
>>> Restrict site-wide JS and raw HTML injection to a smaller subset of
>>> users who have been specifically schooled in these issues.
>>
>> Is it feasible to allow admins to use raw HTML as appropriate but not
>> raw JS? Being able to fix MediaWiki: space messages with raw HTML is
>> way too useful on the occasions where it's useful.
>
> Possible yes, sensible no. Because if you can edit raw HTML, you can
> inject javascript.
>
> -- daniel

Not if you sanitize the HTML after the fact: just cleaning out <script>
tags.
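
(A deliberately minimal sketch of that kind of after-the-fact cleaning;
a real sanitizer would need to whitelist rather than blacklist, and, as
noted elsewhere in the thread, even script-free HTML can still leak
data:)

    <?php
    // Strip <script> elements and inline event-handler attributes from
    // admin-supplied HTML.  Incomplete by design.
    function stripScripts( $html ) {
        $html = preg_replace( '#<script\b[^>]*>.*?</script>#is', '', $html );
        $html = preg_replace( '/\son\w+\s*=\s*("[^"]*"|\'[^\']*\'|\S+)/i', '', $html );
        return $html;
    }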

Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Daniel Kinzler
David Gerard wrote:
> 2009/6/4 Mike.lifeguard :
>> On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:
> 
>>> Then external site loading can be blocked.
> 
>> Why do we need to block loading from all external sites? If there are
>> specific & problematic ones (like google analytics) then why not block
>> those?
> 
> 
> Because having the data go outside Wikimedia at all is a privacy
> policy violation, as I understand it (please correct me if I'm wrong).

I agree with that, *especially* if it's for the purpose of aggregating data
about users.

-- daniel





Re: [Wikitech-l] [Foundation-l] Wikipedia tracks user behaviour via third party companies

2009-06-04 Thread Neil Harris
David Gerard wrote:
> Web bugs for statistical data are a legitimate want but potentially a
> horrible privacy violation.
>
> So I asked on wikitech-l, and the obvious answer appears to be to do
> it internally. Something like http://stats.grok.se/ only more so.
>
> So - if you want web bug data in a way that fits the privacy policy,
> please pop over to the wikitech-l thread with technical suggestions
> and solutions :-)
>
>
> - d.
Yes, modifying the http://stats.grok.se/ systems looks like the way to go.

What do people actually want to see from the traffic data? Do they want 
referrers, anonymized user trails, or what?

-- Neil




Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Daniel Kinzler
David Gerard wrote:
> 2009/6/4 Gregory Maxwell :
> 
>> Restrict site-wide JS and raw HTML injection to a smaller subset of
>> users who have been specifically schooled in these issues.
> 
> 
> Is it feasible to allow admins to use raw HTML as appropriate but not
> raw JS? Being able to fix MediaWiki: space messages with raw HTML is
> way too useful on the occasions where it's useful.
> 

Possible yes, sensible no. Because if you can edit raw HTML, you can
inject javascript.

-- daniel



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Gregory Maxwell
On Thu, Jun 4, 2009 at 11:01 AM, Mike.lifeguard wrote:
> On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:
>
>> Then external site loading can be blocked.
>
>
> Why do we need to block loading from all external sites? If there are
> specific & problematic ones (like google analytics) then why not block
> those?

Because:

(1) External loading results in an uncontrolled leak of private reader
and editor information to third parties, in contravention of the
privacy policy as well as basic ethical operating principles.

(1a) Most external script loading will also defeat users' choice of
SSL and leak more information about their browsing to their local
network. It may also bypass any Wikipedia-specific anonymization
proxies they are using to keep their reading habits private.

(2) External loading produces a runtime dependency on third party
sites. Some other site goes down and our users experience some kind of
loss of service.

(3) The availability of external loading makes Wikimedia a potential
source of very significant DDOS attacks, intentional or otherwise.

That's not to say that there aren't reasons to use remote loading, but
the potential harms mean that it should probably be a default-deny
permit-by-exception process rather than the other way around.
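
(i.e., something shaped like this hypothetical check, with the
permitted hosts enumerated rather than the forbidden ones; the host
list is only an example:)

    <?php
    // Default deny: a resource URL may only be loaded if its host is
    // on an explicit whitelist.
    function isAllowedSource( $url ) {
        $allowed = array( 'upload.wikimedia.org', 'stats.wikimedia.org' );
        $host = parse_url( $url, PHP_URL_HOST );
        return is_string( $host ) && in_array( $host, $allowed, true );
    }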



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Gregory Maxwell
On Thu, Jun 4, 2009 at 10:53 AM, David Gerard  wrote:
> I understand the problem with stats before was that the stats server
> would melt under the load. Leon's old wikistats page sampled 1:1000.
> The current stats (on dammit.lt and served up nicely on
> http://stats.grok.se) are every hit, but I understand (Domas?) that it
> was quite a bit of work to get the firehose of data in such a form as
> not to melt the receiving server trying to process it.
>
> OK, then the problem becomes: how to set up something like
> stats.grok.se feasibly internally for all the other data gathered from
> a hit? (Modulo stuff that needs to be blanked per privacy policy.)

What exactly are people looking for that isn't available from
stats.grok.se that isn't a privacy concern?

I had assumed that people kept installing these bugs because they
wanted source network break downs per-article and other clear privacy
violations.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Mike.lifeguard :
> On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:

>> Then external site loading can be blocked.

> Why do we need to block loading from all external sites? If there are
> specific & problematic ones (like google analytics) then why not block
> those?


Because having the data go outside Wikimedia at all is a privacy
policy violation, as I understand it (please correct me if I'm wrong).


- d.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Gregory Maxwell :

> Restrict site-wide JS and raw HTML injection to a smaller subset of
> users who have been specifically schooled in these issues.


Is it feasible to allow admins to use raw HTML as appropriate but not
raw JS? Being able to fix MediaWiki: space messages with raw HTML is
way too useful on the occasions where it's useful.


- d.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Mike.lifeguard
On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:

> Then external site loading can be blocked.


Why do we need to block loading from all external sites? If there are
specific & problematic ones (like google analytics) then why not block
those?

-Mike


Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Neil Harris
Michael Rosenthal wrote:
> I suggest keep the bug on Wikimedia's servers and using a tool which
> relies on SQL databases. These could be shared with the toolserver
> where the "official" version of the analysis tool runs and users are
> enabled to run their own queries (so taking a tool with a good
> database structure would be nice). With that the toolserver users
> could set up their own cool tools on that data.
>   

If JavaScript were used to serve the bug, it would be quite easy to
load the bug only some small fraction of the time, allowing a fair
statistical sample of JS-enabled readers (who should, I hope, be fairly
representative of the whole population) to be taken without melting
down the servers.

I suspect the fact that most bots and spiders do not interpret
JavaScript, and would thus be excluded from participating in the
traffic survey, could be regarded as an added bonus.
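
(Sketched as the server emitting a self-sampling snippet; the endpoint
and the 0.1% rate are made up:)

    <?php
    // Each JS-enabled client rolls the dice itself, so only ~1 in 1000
    // views fetches the bug -- and non-JS bots never participate.
    echo "<script type=\"text/javascript\">\n"
       . "if (Math.random() < 0.001) {\n"
       . "  new Image().src = 'http://stats.example.org/bug.gif?r=' + Math.random();\n"
       . "}\n"
       . "</script>";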

-- Neil




Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Gregory Maxwell
On Thu, Jun 4, 2009 at 10:19 AM, David Gerard  wrote:
> Keeping well-meaning admins from putting Google web bugs in the
> JavaScript is a game of whack-a-mole.
>
> Are there any technical workarounds feasible? If not blocking the
> loading of external sites entirely (I understand hu:wp uses a web bug
> that isn't Google), perhaps at least listing the sites somewhere
> centrally viewable?

Restrict site-wide JS and raw HTML injection to a smaller subset of
users who have been specifically schooled in these issues.


This approach is also compatible with other approaches. It has the
advantage of being simple to implement and should produce a
considerable reduction in problems regardless of the underlying cause.


Just be glad no one has yet turned english wikipedia's readers into
their own personal DDOS drone network.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Daniel Kinzler
Michael Rosenthal wrote:
> I suggest keep the bug on Wikimedia's servers and using a tool which
> relies on SQL databases. These could be shared with the toolserver
> where the "official" version of the analysis tool runs and users are
> enabled to run their own queries (so taking a tool with a good
> database structure would be nice). With that the toolserver users
> could set up their own cool tools on that data.

Well, the original problem is that wikipedia has so many page views that
writing each one to a database would simply melt it. We are talking about
5 hits per second. This is of course also true for the toolserver.

I was thinking about a solution that uses sampling, or one that would only
be applied to specific pages or small projects. We had something similar
with the old wikicharts.

-- daniel



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Michael Rosenthal :

> I suggest keep the bug on Wikimedia's servers and using a tool which
> relies on SQL databases. These could be shared with the toolserver
> where the "official" version of the analysis tool runs and users are
> enabled to run their own queries (so taking a tool with a good
> database structure would be nice). With that the toolserver users
> could set up their own cool tools on that data.


I understand the problem with stats before was that the stats server
would melt under the load. Leon's old wikistats page sampled 1:1000.
The current stats (on dammit.lt and served up nicely on
http://stats.grok.se) are every hit, but I understand (Domas?) that it
was quite a bit of work to get the firehose of data in such a form as
not to melt the receiving server trying to process it.

OK, then the problem becomes: how to set up something like
stats.grok.se feasibly internally for all the other data gathered from
a hit? (Modulo stuff that needs to be blanked per privacy policy.)


- d.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Michael Rosenthal
I suggest keeping the bug on Wikimedia's servers and using a tool which
relies on SQL databases. These could be shared with the toolserver
where the "official" version of the analysis tool runs and users are
enabled to run their own queries (so taking a tool with a good
database structure would be nice). With that the toolserver users
could set up their own cool tools on that data.

On Thu, Jun 4, 2009 at 4:34 PM, David Gerard  wrote:
> 2009/6/4 Daniel Kinzler :
>> David Gerard schrieb:
>
>>> Keeping well-meaning admins from putting Google web bugs in the
>>> JavaScript is a game of whack-a-mole.
>>> Are there any technical workarounds feasible? If not blocking the
>
>> Perhaps the solution would be to simply set up our own JS based usage 
>> tracker?
>> There are a few options available
>> , and for 
>> starters,
>> the backend could run on the toolserver.
>> Note that anything processing IP addresses will need special approval on the 
>> TS.
>
>
> If putting that on the toolserver passes privacy policy muster, that'd
> be an excellent solution. Then external site loading can be blocked.
>
> (And if the toolservers won't melt in the process.)
>
>
> - d.
>
>



Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread H. Langos
On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
> Seems like (at least) the API of #pos in ParserFunctions is
> different from the one in StringFunctions.
> 
> {{#pos: haystack|needle|offset}}
> 
> While the StringFunctions #pos in MediaWiki 1.14 returned an
> empty string when the needle was not found, the ParserFunctions
> implementation of #pos in svn now returns -1.
> 

I forgot to ask THE question. Is it a bug or is there some good reason 
to break backward compatibility?

And no, programming language cosmetics is not a good reason. :-)

If something has the same interface, it should have the same behaviour. 
If the old semantics was too awful to bear, the new one should have been
called #strpos or #fpos (for forward-#pos; #rpos always had the
"-1 return on not-found" behaviour).


cheers
-henrik




Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
2009/6/4 Daniel Kinzler :
> David Gerard schrieb:

>> Keeping well-meaning admins from putting Google web bugs in the
>> JavaScript is a game of whack-a-mole.
>> Are there any technical workarounds feasible? If not blocking the

> Perhaps the solution would be to simply set up our own JS based usage tracker?
> There are a few options available
> , and for 
> starters,
> the backend could run on the toolserver.
> Note that anything processing IP addresses will need special approval on the 
> TS.


If putting that on the toolserver passes privacy policy muster, that'd
be an excellent solution. Then external site loading can be blocked.

(And if the toolservers won't melt in the process.)


- d.



Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread Daniel Kinzler
David Gerard schrieb:
> Keeping well-meaning admins from putting Google web bugs in the
> JavaScript is a game of whack-a-mole.
> 
> Are there any technical workarounds feasible? If not blocking the
> loading of external sites entirely (I understand hu:wp uses a web bug
> that isn't Google), perhaps at least listing the sites somewhere
> centrally viewable?

Perhaps the solution would be to simply set up our own JS based usage tracker?
There are a few options available
, and for starters,
the backend could run on the toolserver.

Note that anything processing IP addresses will need special approval on the TS.

-- daniel



[Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?

2009-06-04 Thread David Gerard
Keeping well-meaning admins from putting Google web bugs in the
JavaScript is a game of whack-a-mole.

Are there any technical workarounds feasible? If not blocking the
loading of external sites entirely (I understand hu:wp uses a web bug
that isn't Google), perhaps at least listing the sites somewhere
centrally viewable?


- d.



[Wikitech-l] StringFunctions/ParserFunctions #pos return value changed

2009-06-04 Thread H. Langos
Seems like (at least) the API of #pos in ParserFunctions is
different from the one in StringFunctions.

{{#pos: haystack|needle|offset}}

While the StringFunctions #pos in MediaWiki 1.14 returned an
empty string when the needle was not found, the ParserFunctions
implementation of #pos in svn now returns -1.

This is most unfortunate, since current usage depends on the old behaviour.
Example:

{{#if: {{#pos: abcd|b}} | found | not found }}

{{#if: {{#pos: abcd|x}} | found | not found }}

Now both of these examples will return "found"!


Usage scenario:

I am trying to use #pos in template calls to implement a sort-of-database
functionality in a MediaWiki installation.

I have a big template that contains data in named parameters.
Those parameters get passed along to a template that can select "columns"
by rendering some of those named parameters and ignoring others.

Now I want to implement "row selection" by passing along a parameter name
and a substring that should be in the value of that parameter in order
for the data to be rendered.

something like this:

{{#if: {{#pos: {{{ {{{selectionattribute}}} }}} | {{{selectionvalue}}} }} | 
render_row | render_nothing }}

If I want this to work in different MediaWiki installations I need
to rely on the API of #pos.

Currently there seems to be no clean way to use #pos so that it works
on both 1.14 and 1.15-svn.
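
The closest I can come up with is testing for both not-found values
explicitly, along these lines (an untested sketch):

{{#ifeq: {{#pos: abcd|x}} | -1
| not found
| {{#if: {{#pos: abcd|x}} | found | not found }}
}}

The outer #ifeq catches the new -1 and the inner #if still catches the
old empty string, at the cost of evaluating #pos twice.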

cheers
-henrik

