[Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
Seems like (at least) the API of #pos in ParserFunctions is different from the one in StringFunctions.

{{#pos: haystack|needle|offset}}

While the StringFunctions #pos in MediaWiki 1.14 returned an empty string when the needle was not found, the ParserFunctions implementation of #pos in svn now returns -1. This is most unfortunate since current usage depends on this. Example:

{{#if: {{#pos: abcd|b}} | found | not found }}
{{#if: {{#pos: abcd|x}} | found | not found }}

Now both of these examples will return "found"!

Usage scenario: I try to use #pos in template calls to implement a sort-of-database functionality in a MediaWiki. I have a big template that contains data in named parameters. Those parameters get passed along to a template that can select columns by rendering some of those named parameters and ignoring others. Now I want to implement row selection by passing along a parameter name and a substring that should be in the value of that parameter in order for the data to be rendered. Something like this:

{{#if: {{#pos: {{{ {{{selectionattribute}}} }}} | {{{selectionvalue}}} }} | render_row | render_nothing }}

If I want this to work in different MediaWiki installations I need to rely on the API of #pos. Currently there seems to be no way to use #pos in a way that works on 1.14 and on 1.15-svn.

cheers
-henrik
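(A version-agnostic test may still be possible by checking for both sentinel values, the empty string of 1.14 and the -1 of 1.15-svn. An untested sketch:

  {{#ifeq: {{#pos: abcd|x}} | -1
  | not found
  | {{#if: {{#pos: abcd|x}} | found | not found }}
  }}

On 1.15-svn the -1 result is caught by the #ifeq; on 1.14 the empty result falls through to the #if, so both versions render "not found". A hit at offset 0 still renders "found", since "0" is neither equal to -1 nor empty.)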
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
David Gerard schrieb:
> Keeping well-meaning admins from putting Google web bugs in the JavaScript is a game of whack-a-mole. Are there any technical workarounds feasible? If not blocking the loading of external sites entirely (I understand hu:wp uses a web bug that isn't Google), perhaps at least listing the sites somewhere centrally viewable?

Perhaps the solution would be to simply set up our own JS based usage tracker? There are a few options available (http://en.wikipedia.org/wiki/List_of_web_analytics_software), and for starters, the backend could run on the toolserver. Note that anything processing IP addresses will need special approval on the TS.

-- daniel
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
> Seems like (at least) the API of #pos in ParserFunctions is different from the one in StringFunctions. {{#pos: haystack|needle|offset}} While the StringFunctions #pos in MediaWiki 1.14 returned an empty string when the needle was not found, the ParserFunctions implementation of #pos in svn now returns -1.

I forgot to ask THE question. Is it a bug or is there some good reason to break backward compatibility? And no, programming language cosmetics is not a good reason. :-)

If something has the same interface, it should have the same behaviour. If the old semantics was too awful to bear, the new one should have been called #strpos or #fpos (for forward-#pos; #rpos always had the -1 return on not-found behaviour).

cheers
-henrik
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
I suggest keeping the bug on Wikimedia's servers and using a tool which relies on SQL databases. These could be shared with the toolserver, where the official version of the analysis tool runs and users are enabled to run their own queries (so taking a tool with a good database structure would be nice). With that the toolserver users could set up their own cool tools on that data.

On Thu, Jun 4, 2009 at 4:34 PM, David Gerard dger...@gmail.com wrote:
> 2009/6/4 Daniel Kinzler dan...@brightbyte.de:
>> David Gerard schrieb:
>>> Keeping well-meaning admins from putting Google web bugs in the JavaScript is a game of whack-a-mole. Are there any technical workarounds feasible? If not blocking the
>> Perhaps the solution would be to simply set up our own JS based usage tracker? There are a few options available (http://en.wikipedia.org/wiki/List_of_web_analytics_software), and for starters, the backend could run on the toolserver. Note that anything processing IP addresses will need special approval on the TS.
> If putting that on the toolserver passes privacy policy muster, that'd be an excellent solution. Then external site loading can be blocked. (And if the toolservers won't melt in the process.)
> - d.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
On Thu, Jun 4, 2009 at 10:19 AM, David Gerard dger...@gmail.com wrote:
> Keeping well-meaning admins from putting Google web bugs in the JavaScript is a game of whack-a-mole. Are there any technical workarounds feasible? <snip>

Restrict site-wide JS and raw HTML injection to a smaller subset of users who have been specifically schooled in these issues.

This approach is also compatible with other approaches. It has the advantage of being simple to implement and should produce a considerable reduction in problems regardless of the underlying cause.

Just be glad no one has yet turned English Wikipedia's readers into their own personal DDoS drone network.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
Michael Rosenthal wrote:
> I suggest keeping the bug on Wikimedia's servers and using a tool which relies on SQL databases. These could be shared with the toolserver, where the official version of the analysis tool runs and users are enabled to run their own queries (so taking a tool with a good database structure would be nice). With that the toolserver users could set up their own cool tools on that data.

If Javascript was used to serve the bug, it would be quite easy to only load the bug some small fraction of the time, allowing a fair statistical sample of JS-enabled readers (who should, I hope, be fairly representative of the whole population) to be taken without melting down the servers.

I suspect the fact that most bots and spiders do not interpret Javascript, and would thus be excluded from participating in the traffic survey, could be regarded as an added bonus.

-- Neil
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
2009/6/4 Mike.lifeguard mikelifegu...@fastmail.fm:
> On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:
>> Then external site loading can be blocked.
> Why do we need to block loading from all external sites? If there are specific problematic ones (like google analytics) then why not block those?

Because having the data go outside Wikimedia at all is a privacy policy violation, as I understand it (please correct me if I'm wrong).

- d.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
On Thu, Jun 4, 2009 at 10:53 AM, David Gerard dger...@gmail.com wrote:
> I understand the problem with stats before was that the stats server would melt under the load. Leon's old wikistats page sampled 1:1000. The current stats (on dammit.lt and served up nicely on http://stats.grok.se) are every hit, but I understand (Domas?) that it was quite a bit of work to get the firehose of data in such a form as not to melt the receiving server trying to process it. OK, then the problem becomes: how to set up something like stats.grok.se feasibly internally for all the other data gathered from a hit? (Modulo stuff that needs to be blanked per privacy policy.)

What exactly are people looking for that isn't available from stats.grok.se that isn't a privacy concern? I had assumed that people kept installing these bugs because they wanted source network breakdowns per article and other clear privacy violations.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
On Thu, Jun 4, 2009 at 11:01 AM, Mike.lifeguard mikelifegu...@fastmail.fm wrote:
> On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:
>> Then external site loading can be blocked.
> Why do we need to block loading from all external sites? If there are specific problematic ones (like google analytics) then why not block those?

Because:

(1) External loading results in an uncontrolled leak of private reader and editor information to third parties, in contravention of the privacy policy as well as basic ethical operating principles.

(1a) Most external loading script usage will also defeat users' choice of SSL and leak more information about their browsing to their local network. It may also bypass any Wikipedia-specific anonymization proxies they are using to keep their reading habits private.

(2) External loading produces a runtime dependency on third-party sites. Some other site goes down and our users experience some kind of loss of service.

(3) The availability of external loading makes Wikimedia a potential source of very significant DDoS attacks, intentional or otherwise.

That's not to say that there aren't reasons to use remote loading, but the potential harms mean that it should probably be a default-deny, permit-by-exception process rather than the other way around.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
David Gerard schrieb:
> 2009/6/4 Gregory Maxwell gmaxw...@gmail.com:
>> Restrict site-wide JS and raw HTML injection to a smaller subset of users who have been specifically schooled in these issues.
> Is it feasible to allow admins to use raw HTML as appropriate but not raw JS? Being able to fix MediaWiki: space messages with raw HTML is way too useful on the occasions where it's useful.

Possible yes, sensible no. Because if you can edit raw HTML, you can inject JavaScript.

-- daniel
Re: [Wikitech-l] [Foundation-l] Wikipedia tracks user behaviour via third party companies
David Gerard wrote:
> Web bugs for statistical data are a legitimate want but potentially a horrible privacy violation. So I asked on wikitech-l, and the obvious answer appears to be to do it internally. Something like http://stats.grok.se/ only more so. So - if you want web bug data in a way that fits the privacy policy, please pop over to the wikitech-l thread with technical suggestions and solutions :-)
> - d.

Yes, modifying the http://stats.grok.se/ systems looks like the way to go.

What do people actually want to see from the traffic data? Do they want referrers, anonymized user trails, or what?

-- Neil
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
David Gerard schrieb:
> 2009/6/4 Mike.lifeguard mikelifegu...@fastmail.fm:
>> On Thu, 2009-06-04 at 15:34 +0100, David Gerard wrote:
>>> Then external site loading can be blocked.
>> Why do we need to block loading from all external sites? If there are specific problematic ones (like google analytics) then why not block those?
> Because having the data go outside Wikimedia at all is a privacy policy violation, as I understand it (please correct me if I'm wrong).

I agree with that, *especially* if it's for the purpose of aggregating data about users.

-- daniel
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
On Thu, Jun 4, 2009 at 17:00, Gregory Maxwell gmaxw...@gmail.com wrote:
> On Thu, Jun 4, 2009 at 10:53 AM, David Gerard dger...@gmail.com wrote:
>> I understand the problem with stats before was that the stats server would melt under the load. <snip>
> What exactly are people looking for that isn't available from stats.grok.se that isn't a privacy concern? I had assumed that people kept installing these bugs because they wanted source network breakdowns per article and other clear privacy violations.

On top of views/page I'd be interested in keywords used, entry & exit points, path analysis when people are editing (do they save/leave/try to find help/...), #edit starts, #submitted edits that don't get saved.

henna

--
"Maybe you knew early on that your track went from point A to B, but unlike you I wasn't given a map at birth!" Alyssa, Chasing Amy
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
Neil Harris wrote:
> Daniel Kinzler wrote:
>> David Gerard schrieb:
>>> 2009/6/4 Gregory Maxwell gmaxw...@gmail.com:
>>>> Restrict site-wide JS and raw HTML injection to a smaller subset of users who have been specifically schooled in these issues.
>>> Is it feasible to allow admins to use raw HTML as appropriate but not raw JS? Being able to fix MediaWiki: space messages with raw HTML is way too useful on the occasions where it's useful.
>> Possible yes, sensible no. Because if you can edit raw HTML, you can inject JavaScript. -- daniel
> Not if you sanitize the HTML after the fact: just cleaning out script tags and elements from the HTML stream should do the job. After this has been done to the user-generated content, the desired locked-down script code can then be inserted at the final stages of page generation. -- Neil

Come to think of it, you could also allow the carefully vetted loading of scripts from a very limited whitelist of Wikimedia-hosted and controlled domains and paths, when performing that sanitization.

Inline scripts remain a bad idea: there are just too many ways to obfuscate them and/or inject data into them to have any practical prospect of limiting them to safe features without heroic efforts.

However, writing a JavaScript sanitizer that restricted the user to a safe subset of the language, by first parsing and then resynthesizing the code using formal methods for validation, in a way similar to the current solution for TeX, would be an interesting project!

-- Neil
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
On 04/06/2009, at 4:08 PM, Daniel Kinzler wrote:
> David Gerard schrieb:
>> 2009/6/4 Gregory Maxwell gmaxw...@gmail.com:
>>> Restrict site-wide JS and raw HTML injection to a smaller subset of users who have been specifically schooled in these issues.
>> Is it feasible to allow admins to use raw HTML as appropriate but not raw JS? <snip>
> Possible yes, sensible no. Because if you can edit raw HTML, you can inject JavaScript.

When did we start treating our administrators as potentially malicious attackers? Any administrator could, in theory, add a cookie-stealing script to my user JS, steal my account, and grant themselves any rights they please.

We trust our administrators. If we don't, we should move the editinterface right further up the chain.

-- Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
Thanks, that clarifies matters for me. I wasn't aware of #1, though I guess upon reflection that makes sense.

-Mike

On Thu, 2009-06-04 at 11:07 -0400, Gregory Maxwell wrote:
> Because:
> (1) External loading results in an uncontrolled leak of private reader and editor information to third parties, in contravention of the privacy policy as well as basic ethical operating principles.
> <snip>
> That's not to say that there aren't reasons to use remote loading, but the potential harms mean that it should probably be a default-deny, permit-by-exception process rather than the other way around.
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On Thu, Jun 4, 2009 at 6:55 AM, H. Langos henrik...@prak.org wrote:
> Seems like (at least) the API of #pos in ParserFunctions is different from the one in StringFunctions. {{#pos: haystack|needle|offset}} While the StringFunctions #pos in MediaWiki 1.14 returned an empty string when the needle was not found, the ParserFunctions implementation of #pos in svn now returns -1.
<snip>

Prior to the merge 100% of the StringFunction function calls were reimplemented, principally for performance and security reasons. The short but uninspired answer to your question is that in doing that I didn't notice that #pos and #rpos had different default behavior. Given the way that #if works, returning an empty string is a reasonable response to a string-not-found condition, and I am happy to change that back. I'll also recheck to make sure there aren't any other unexpected behavioral changes.

Though they don't have to have the same behavior, I'd be inclined to argue that #pos and #rpos really ought to have the same default behavior on usability grounds, i.e. either both giving -1 or both giving the empty string when a match is not found. Though since that does create compatibility issues with existing StringFunctions users, I'll defer to others about whether consistency would be a good enough motivation in this case.

I should warn you though that there is an intentional behavioral change regarding the handling of strip markers. The pre-existing StringFunctions codebase reacted to strip markers in a way that was inefficient, hard for the end user to predict, and in specially crafted cases created security issues. The following example is illustrative of the change. Consider the string:

ABC<nowiki>jkl</nowiki>DEF<nowiki>mno</nowiki>GHI

In the new implementation this is treated internally as ABCDEFGHI by the string routines. Hence its length is 9 and its first five characters are ABCDE. For complicated reasons the StringFunctions version says its length is 7 and the first five characters are ABCjklDEFmnoG.

-Robert Rohde
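(To make the new strip-marker semantics concrete, here is what the merged string functions should return for that string, going by the description above; this is hypothetical output from the stated rules, not something run against the actual code:

  {{#len: ABC<nowiki>jkl</nowiki>DEF<nowiki>mno</nowiki>GHI }}          returns 9
  {{#sub: ABC<nowiki>jkl</nowiki>DEF<nowiki>mno</nowiki>GHI | 0 | 5 }}  returns ABCDE

The nowiki islands are simply invisible to the string routines, instead of being counted inconsistently as the old StringFunctions did.)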
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On Thu, Jun 04, 2009 at 05:05:50PM +0100, Andrew Garrett wrote:
> On 04/06/2009, at 3:46 PM, H. Langos wrote:
>> On Thu, Jun 04, 2009 at 03:55:38PM +0200, H. Langos wrote:
>>> Seems like (at least) the API of #pos in ParserFunctions is different from the one in StringFunctions. <snip>
>> I forgot to ask THE question. Is it a bug or is there some good reason to break backward compatibility? <snip>
> This should be left as a comment on the relevant revision in CodeReview. Note that it's likely irrelevant anyway, as, in all likelihood, the merge of String and Parser Functions will be reverted.

Sorry to bother you, but I am not a Wikimedia developer so I wouldn't know where to start looking. Could you point me to the right place/list/article? The svn revision with the String and Parser Functions merge was 50997.

cheers
-henrik
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On Thu, Jun 4, 2009 at 9:05 AM, Andrew Garrett agarr...@wikimedia.org wrote:
> On 04/06/2009, at 3:46 PM, H. Langos wrote:
>> I forgot to ask THE question. Is it a bug or is there some good reason to break backward compatibility? <snip>
> This should be left as a comment on the relevant revision in CodeReview. Note that it's likely irrelevant anyway, as, in all likelihood, the merge of String and Parser Functions will be reverted.

Two devs, who shall remain nameless unless they choose to take credit for it, explicitly encouraged the merge. Personally, I've always thought it made more sense to keep these as separate extensions, but I went along with what they encouraged me to do.

Regardless of whether it is one extension or two, I do strongly feel that once a technically acceptable implementation of string functions exists then it should be enabled on WMF sites. (I agree though that the previous StringFunctions was rightly excluded due to implementation problems.)

-Robert Rohde
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
On Thu, 2009-06-04 at 17:04 +0100, Andrew Garrett wrote:
> When did we start treating our administrators as potentially malicious attackers? Any administrator could, in theory, add a cookie-stealing script to my user JS, steal my account, and grant themselves any rights they please. We trust our administrators. If we don't, we should move the editinterface right further up the chain.

They are potentially malicious attackers, but we nevertheless trust them not to do bad things. "We" in this case refers only to most of Wikimedia, I guess, since there has been no shortage of paranoia both on bugzilla and this list recently - a sad state of affairs to be sure.

-Mike
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
How does installing 3rd party analytics software help the WMF accomplish its goals?

On Thu, Jun 4, 2009 at 8:31 AM, Daniel Kinzler dan...@brightbyte.de wrote:
> David Gerard schrieb:
>> Keeping well-meaning admins from putting Google web bugs in the JavaScript is a game of whack-a-mole. Are there any technical workarounds feasible? <snip>
> Perhaps the solution would be to simply set up our own JS based usage tracker? There are a few options available (http://en.wikipedia.org/wiki/List_of_web_analytics_software), and for starters, the backend could run on the toolserver. Note that anything processing IP addresses will need special approval on the TS.
> -- daniel
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On Thu, Jun 4, 2009 at 12:05 PM, Andrew Garrett agarr...@wikimedia.org wrote:
> Note that it's likely irrelevant anyway, as, in all likelihood, the merge of String and Parser Functions will be reverted.

Have Tim or Brion said this? https://bugzilla.wikimedia.org/show_bug.cgi?id=6455#c36 is the only clear statement I've seen by either of them that I can recall.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
2009/6/4 Finne Boonen hen...@gmail.com:
> On Thu, Jun 4, 2009 at 17:00, Gregory Maxwell gmaxw...@gmail.com wrote:
>> What exactly are people looking for that isn't available from stats.grok.se that isn't a privacy concern? <snip>
> On top of views/page I'd be interested in keywords used, entry & exit points, path analysis when people are editing (do they save/leave/try to find help/...), #edit starts, #submitted edits that don't get saved.

Path analysis is a big one. All that other stuff, if it won't violate privacy, would be fantastically useful to researchers, internal and external, in ways we won't have even thought of yet, and help us considerably to improve the projects.

(This would have to be given considerable thought from a security/hacker mindset - e.g. even with IPs stripped, listing user pages and user page edits would likely give away an identity. Talk pages may do the same. Those are just off the top of my head; I'm sure someone has already made a list of what they could work out even with IPs anonymised or even stripped.)

- d.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
2009/6/4 Andrew Garrett agarr...@wikimedia.org:
> When did we start treating our administrators as potentially malicious attackers? Any administrator could, in theory, add a cookie-stealing script to my user JS, steal my account, and grant themselves any rights they please.

That's why I started this thread talking about things being done right now by well-meaning admins :-)

- d.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
On Thu, Jun 4, 2009 at 11:56 AM, Neil Harris use...@tonal.clara.co.uk wrote:
> However, writing a JavaScript sanitizer that restricted the user to a safe subset of the language, by first parsing and then resynthesizing the code using formal methods for validation, in a way similar to the current solution for TeX, would be an interesting project!

Interesting, but probably not very useful. If we restricted JavaScript the way we restricted TeX, we'd have to ban function definitions, loops, conditionals, and most function calls. I suspect you'd have to make it pretty much unusable to make output of specific strings impossible.

On Thu, Jun 4, 2009 at 12:45 PM, Gregory Maxwell gmaxw...@gmail.com wrote:
> Regarding HTML sanitation: Raw HTML alone without JS is enough to violate users' privacy: Just add a hidden image tag to a remote site. Yes you could sanitize out various bad things, but then that's not raw HTML anymore, is it?

It might be good enough for the purposes at hand, though. What are the use-cases for wanting raw HTML in messages, instead of wikitext or plaintext?
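(For what it's worth, the web bug Gregory describes needs nothing but a single image element; any raw-HTML message could carry something like this, tracker.example.org being a made-up host:

  <img src="http://tracker.example.org/bug.gif" width="1" height="1" alt="" />

Every rendering of the message then hands the reader's IP address, user agent and, via the Referer header, the page being viewed to the third party, with no JavaScript involved.)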
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
2009/6/4 Brian brian.min...@colorado.edu:
> How does installing 3rd party analytics software help the WMF accomplish its goals?

Detailed analysis of how users actually use the site would be vastly useful in improving the sites' content and usability.

- d.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
On Thu, Jun 4, 2009 at 2:32 PM, David Gerard dger...@gmail.com wrote:
> 2009/6/4 Gregory Maxwell gmaxw...@gmail.com:
>> I think the biggest problem to reducing accesses is that far more mediawiki messages are uncooked than is needed. Were it not for this I expect this access would have been curtailed somewhat a long time ago.
> I think you've hit the actual problem there. Someone with too much time on their hands who could go through all of the MediaWiki: space to see what really needs to be HTML rather than wikitext?
> - d.

See bug 212[1], which is (sort of) a tracker for the wikitext-ification of the messages.

-Chad

[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=212
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On Thu, Jun 4, 2009 at 2:29 PM, Brian brian.min...@colorado.edu wrote:
> I was privy to a #mediawiki conversation between brion/tim where tim pointed out that at least one person plans to implement a Natural Language Processing parser for English using StringFunctions just as soon as they are enabled. It's pretty obvious that you can implement all sorts of crazy algorithms using StringFunctions. They need to be limited so that is not possible.

Note, though, that there are some that are already possible to some extent. You can use the core padright/padleft functions to emulate a couple of the added functions. E.g.: http://en.wikipedia.org/w/index.php?title=Template:Str_len&action=edit

The most template-heavy pages already tend to run close to the template limits, until they're cut down by users when they fail. It's not clear to me that allowing more functions would actually increase overall load or template complexity significantly. It might decrease it by allowing simpler and more efficient implementations of things that currently need to be worked around. It can't really increase it too much, theoretically -- that's what the template limits are for.

Werdna points out that Tim did say this morning in #mediawiki that he'd probably revert the change.
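(For those who haven't seen the hack: {{padleft:|n|string}} pads the empty string to length n using string as the padding, which in effect yields the first n characters of string. A rough illustration of how a length test can be built from that - my own sketch, with a ¿ sentinel assumed not to occur in {{{s}}}, not the actual Template:Str_len source:

  {{padleft:|1|Foobar}}   returns F
  {{padleft:|3|Foobar}}   returns Foo
  {{#ifeq: {{padleft:|5|{{{s}}}¿}} | {{padleft:|5|{{{s}}}}} | length is at least 5 | length is less than 5 }}

If {{{s}}} has five or more characters the sentinel never makes it into the first five, so the two sides agree; chaining such comparisons over different n can recover the exact length.)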
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On Thu, Jun 4, 2009 at 11:29 AM, Brian brian.min...@colorado.edu wrote:
> I was privy to a #mediawiki conversation between brion/tim where tim pointed out that at least one person plans to implement a Natural Language Processing parser for English using StringFunctions just as soon as they are enabled. <snip>

If you are referring to the conversation I think you are, then my impression was Tim was speaking hypothetically about the issue rather than knowing someone that had this specific intent.

I'm fairly dubious about anyone actually trying natural language processing to any serious degree. Real natural language processing needs huge lookup tables to identify part of speech and relationships etc. Technically possible I suppose, but not easy to do. I'm even more dubious that full-fledged natural language processing -- in templates -- would find significant uses. It is more efficient and more practical to view templates as simple formatting macros rather than as a system for real natural language interaction. There are very useful things that can be done with simple string algorithms, such as detecting the (bar) when given a title like Foo (bar), but I wouldn't expect anyone to be answering queries with them or anything like that.

When providing tools to content creators, flexibility is generally a positive design feature. We shouldn't go overboard with imposing limits in advance of actual problems. The current implementation is artificially limited to 1000 characters or less, which does prevent huge manipulations, however.

-Robert Rohde
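(As a toy example of that Foo (bar) case, once #pos and #sub are available something along these lines should work - untested, and assuming the 1.14-style empty result for #pos on no match per the discussion above, and that #sub runs to the end of the string when no length is given:

  {{#if: {{#pos: {{PAGENAME}} | (}}
  | {{#sub: {{PAGENAME}} | {{#pos: {{PAGENAME}} | (}} }}
  | no disambiguator
  }}

For a page named Foo (bar) the #pos call yields 4, so the #sub call returns the tail (bar); for plain Foo the empty #pos result drops through to the "no disambiguator" branch.)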
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On Thu, Jun 4, 2009 at 11:52 AM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> Note, though, that there are some that are already possible to some extent. You can use the core padright/padleft functions to emulate a couple of the added functions. E.g.: http://en.wikipedia.org/w/index.php?title=Template:Str_len&action=edit
<snip>

I would like to note for the record that Brion explicitly endorsed the padleft hack to the degree that he re-enabled it after Werdna had removed it. [1] Maybe he'd change his mind after looking at how the string manipulation templates are actually getting used (now in 20,000 enwiki pages and counting), but for the moment he seems to have supported allowing some form of hacked-together string manipulation system into MediaWiki. To that end it makes more sense to have a real string implementation rather than the ridiculous templates we have now.

-Robert Rohde

[1] http://svn.wikimedia.org/viewvc/mediawiki?view=rev&revision=47411
[Wikitech-l] Internal links and diacritics
Hi,

I'm trying to format a link like this: [[musulman]]ă. On ro.wp, this is equivalent to [[musulman|musulman]]ă (the special letter is not included in the wiki link). While going through http://www.mediawiki.org/wiki/Markup_spec I saw that:

  internal-link     ::= internal-link-start article-link [ # section-id ] [ pipe [link-description] ] internal-link-end [extra-description]
  extra-description ::= letter [extra-description]
  letter            ::= ucase-letter | lcase-letter
  ucase-letter      ::= A | B | ... | Y | Z
  lcase-letter      ::= a | b | ... | y | z

This tells me that only ASCII letters are used for this type of linking. However, on fr.wp I can write [[Ren]]é and this is equivalent to [[Ren|René]]. How was this done? Is it something that can be set from a page or should some PHP be changed?

Thanks,
Strainu
Re: [Wikitech-l] Internal links and diacritics
You have to use the MediaWiki:Linktrail page, for example: http://hu.wikipedia.org/wiki/MediaWiki:Linktrail (or see the same page on fr.wiki).

D.
Re: [Wikitech-l] Internal links and diacritics
On Thu, Jun 4, 2009 at 10:38 PM, Tar Dániel bdane...@gmail.com wrote:
> You have to use the MediaWiki:Linktrail page, for example: http://hu.wikipedia.org/wiki/MediaWiki:Linktrail (or see the same page on fr.wiki).

AFAIK, it has to be set in the language file through the $linkTrail variable, because it looks like MediaWiki:Linktrail is no longer used.
Re: [Wikitech-l] Internal links and diacritics
2009/6/4 Strainu strain...@gmail.com:
> While going through http://www.mediawiki.org/wiki/Markup_spec I saw that:
> <snip>
> This tells me that only ASCII letters are used for this type of linking.

It's wrong. Don't trust that page too much. It was written after the fact to try to document the parser, not something the parser was designed to follow. It's almost certainly wrong in a lot of corner cases. (Like non-English languages, apparently.)

On Thu, Jun 4, 2009 at 3:53 PM, Ahmad Sherif ahmad.m.she...@gmail.com wrote:
> AFAIK, it has to be set in the language file through the $linkTrail variable, because it looks like MediaWiki:Linktrail is no longer used.

Correct.
Re: [Wikitech-l] Internal links and diacritics
2009/6/4 Strainu strain...@gmail.com:
> Hi, I'm trying to format a link like this: [[musulman]]ă. On ro.wp, this is equivalent to [[musulman|musulman]]ă (the special letter is not included in the wiki link).
> <snip>
> However, on fr.wp I can write [[Ren]]é and this is equivalent to [[Ren|René]]. How was this done? Is it something that can be set from a page or should some PHP be changed?

The set of characters allowed in the so-called linktrail depends on the language used, and is set in the individual LanguageXx.php files.

Roan Kattouw (Catrope)
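(For the curious, the setting is a one-line regular expression in the per-language file; the French one looks roughly like this - quoted from memory, so treat the exact character range as approximate:

  $linkTrail = '/^([a-zà-öø-ÿ]+)(.*)$/sD';

Whatever the first capture group matches is pulled into the link text, which is why the é in [[Ren]]é joins the link on fr.wp; a Romanian file would need ă and friends added to that character class to behave the same way.)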
Re: [Wikitech-l] Internal links and diacritics
On Thu, Jun 4, 2009 at 10:53 PM, Ahmad Sherif ahmad.m.she...@gmail.com wrote:
> AFAIK, it has to be set in the language file through the $linkTrail variable, because it looks like MediaWiki:Linktrail is no longer used.

Yep, I started from there and got to http://meta.wikimedia.org/wiki/MediaWiki_talk:Linktrail
It suddenly became all clear :)

Thank you all for your responses.

Strainu
Re: [Wikitech-l] StringFunctions/ParserFunctions #pos return value changed
On 04/06/2009, at 8:03 PM, Robert Rohde wrote:
> I would like to note for the record that Brion explicitly endorsed the padleft hack to the degree that he re-enabled it after Werdna had removed it. [1] Maybe he'd change his mind after looking at how the string manipulation templates are actually getting used (now in 20,000 enwiki pages and counting), but for the moment he seems to have supported allowing some form of hacked-together string manipulation system into MediaWiki. To that end it makes more sense to have a real string implementation rather than the ridiculous templates we have now.

I wouldn't read that into it. I think it's better characterised as reverting attempts to create an arms race over the hacks.

-- Andrew Garrett
Contract Developer, Wikimedia Foundation
agarr...@wikimedia.org
http://werdn.us
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
That's why WMF now has a usability lab.

On Thu, Jun 4, 2009 at 12:34 PM, David Gerard dger...@gmail.com wrote:
> 2009/6/4 Brian brian.min...@colorado.edu:
>> How does installing 3rd party analytics software help the WMF accomplish its goals?
> Detailed analysis of how users actually use the site would be vastly useful in improving the sites' content and usability.
> - d.
Re: [Wikitech-l] Google web bugs in Mediawiki js from admins - technical workarounds?
2009/6/4 Brian brian.min...@colorado.edu:
> That's why WMF now has a usability lab.

Yep. They'd dive on this stuff with great glee if we can implement it without breaking privacy or melting servers.

- d.
[Wikitech-l] firefogg local encode new-upload branch update.
As you may know I have been working on Firefogg integration with MediaWiki. As you may also know the mwEmbed library is being designed to support embedding of these interfaces in arbitrary external contexts. I wanted to quickly highlight a useful stand-alone usage example of the library: http://www.firefogg.org/make/advanced.html

This "Make Ogg" link will be something you can send to a person so they can encode source footage to a local Ogg video file with the latest and greatest Ogg encoders (presently the Thusnelda Theora encoder plus Vorbis audio). Updates to Thusnelda and other free codecs will be pushed out via Firefogg updates.

For Commons / Wikimedia usage we will directly integrate Firefogg (using that same codebase). You can see an example of how that works on the 'new-upload' branch here: http://sandbox.kaltura.com/testwiki/index.php/Special:Upload ... hopefully we will start putting some of this on testing.wikipedia.org ~soonish ?~

The new-upload branch feature set is quite extensive, including the script-loader, jQuery JavaScript refactoring, the new upload API, the new mv_embed video player, the add-media wizard, etc. Any feedback and specific bug reports people can give will be super helpful in gearing up for merging this 'new-upload' branch. For an overview see: http://www.mediawiki.org/wiki/Media_Projects_Overview

peace,
--michael