Re: [Wikitech-l] REMINDER: No deploys next week (week of Sept 26th)
Hurrah! Alex

2016-09-20 19:09 GMT+02:00 Greg Grossmeier:
> Due to the Wikimedia Technical Operations Team having their team offsite that week, and generally being less than normally available, there will be no non-emergency deploys the week of September 26th (aka: next week).
>
> See also: https://wikitech.wikimedia.org/wiki/Deployments#Week_of_September_26th
>
> The normal schedule will resume the following week.
>
> Teaser: the week of October 17th will be a "no train deploys" week, as the Wikimedia Release Engineering Team will be at their offsite that week. I'll send a reminder about this as well closer to that date.
>
> Best,
>
> Greg
>
> --
> | Greg Grossmeier          GPG: B2FA 27B1 F7EB D327 6B8E |
> | Release Team Manager         A18D 1138 8E47 FAC8 1C7D |
Re: [Wikitech-l] Frustration about Canvas
Ok, Ricordisamoa, it runs; thanks for the suggestion and for your boldness in reading my DIY js code ;-) ! Alex

2015-01-08 0:33 GMT+01:00 Ricordisamoa ricordisa...@openmailbox.org:
> It's about https://it.wikisource.org/wiki/MediaWiki:Gadget-cornersAlpha.js, isn't it? This change https://it.wikisource.org/w/index.php?diff=1499050&oldid=1498642 should do the job.
>
> On 07/01/2015 10:21, Alex Brollo wrote:
>> While dragging a cropped clip of the image of a djvu page around a little, I successfully uploaded it into a canvas on it.source, only to crash into a DOM exception ("The canvas has been tainted by cross-origin data") while attempting to access pixel data with both the getImageData() and toDataURL() methods. Again, it seems a CORS issue. Am I wrong? Is there some doc about this issue?
>> Alex brollo
[Wikitech-l] Frustration about Canvas
While dragging a cropped clip of the image of a djvu page around a little, I successfully uploaded it into a canvas on it.source, only to crash into a DOM exception ("The canvas has been tainted by cross-origin data") while attempting to access pixel data with both the getImageData() and toDataURL() methods. Again, it seems a CORS issue. Am I wrong? Is there some doc about this issue?

Alex brollo
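A minimal sketch of the kind of CORS-aware loading I'm talking about (my own illustration, not necessarily how the gadget does it); it only helps if the image host actually answers with the CORS headers (Access-Control-Allow-Origin), otherwise the canvas stays tainted:

function loadClip(imageUrl) {
    var img = new Image();
    // assumption: the host answers with CORS headers for this origin
    img.crossOrigin = "anonymous";
    img.onload = function () {
        var canvas = document.createElement("canvas");
        canvas.width = img.width;
        canvas.height = img.height;
        var ctx = canvas.getContext("2d");
        ctx.drawImage(img, 0, 0);
        // with a proper CORS response the canvas is not tainted,
        // so getImageData() and toDataURL() no longer throw
        console.log(ctx.getImageData(0, 0, canvas.width, canvas.height).width);
        console.log(canvas.toDataURL().slice(0, 30));
    };
    img.src = imageUrl;
}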
Re: [Wikitech-l] Making a plain MW core git clone not be installable
I'm not a developer, so it's perfectly normal that I can't understand anything of your talk; nevertheless, please remember the KISS principle when building any installation tools for poor, final users. I'm waiting for something like "pip install core". Alex

2014-06-11 15:58 GMT+02:00 C. Scott Ananian canan...@wikimedia.org:
> I will mention that any solution short of sucking the third party dependencies into the main repo (not as a submodule) -- which no one wants to do anyway -- will be slightly awkward to git bisect. Not impossible; the pain is about the same for both main options:
>
> a) In theory git-bisect should adjust submodules to the correct hash. In practice you need to run `git submodule update` after every step in order to check out the appropriate submodule commits.
>
> b) Similarly, for composer you need to run a command to update the 3rd party packages, if any dependencies have changed. (For node.js projects, which have similar issues, you need to run `npm install` after every step.)
>
> For regressions that are easily found by running a test suite, you can arrange for `git bisect run` to do the appropriate git, composer, or npm command before running the test. So, it's somewhat awkward, but manageable. And the 3rd-party dependencies typically don't change as often as the core code, so this doesn't come up all that often.
>
> Two main benefits of `git submodule`: (1) perhaps one day `git bisect` and `git submodule` will play nicer together; (2) since references are to a git hash, crawling through history is repeatable. One disadvantage: because git allows `.gitmodules` to differ from your local set of module repo sources (in `.git/config`), it is rather too easy to forget to push a submodule commit referenced from the main repo -- although hopefully jenkins will catch errors of this sort.
>
> The main disadvantage of using `composer`/`npm`/etc directly is that you are at the mercy of the upstream package repo to behave well. That is, if the upstream repo allows uploaders to change the tarball associated with version X.Y.Z, you may find it hard to reproduce past configurations. Similarly, if you specify loose package version constraints, you are at the mercy of the upstream maintainers to actually maintain compatibility between minor versions or what-have-you.
>
> Parsoid and some other WMF projects actually use a hybrid approach where we use `npm` and not submodules, but we maintain a separate deploy repo which combines the main code base (as a submodule) and specific versions of the third party code.
> --scott
Re: [Wikitech-l] How to ask for a python package into Labs
OK, done.

2014-02-23 7:53 GMT+01:00 K. Peachey p858sn...@gmail.com:
> bugzilla.
>
> On 23 February 2014 16:51, Alex Brollo alex.bro...@gmail.com wrote:
>> I need the internetarchive python package on Labs: https://pypi.python.org/pypi/internetarchive , a python bot for Internet Archive. I can't find out how to ask for it to be installed on Labs. Can you help me?
>>
>> It's an interesting package - it can be combined with a pywikibot to manage both mediawiki pages and Internet Archive items, both reading and editing metadata and uploading new items/pages. I've been encouraged to go on by Tpt.
>> Alex brollo
[Wikitech-l] How to ask for a python package into Labs
I need the internetarchive python package on Labs: https://pypi.python.org/pypi/internetarchive , a python bot for Internet Archive. I can't find out how to ask for it to be installed on Labs. Can you help me?

It's an interesting package - it can be combined with a pywikibot to manage both mediawiki pages and Internet Archive items, both reading and editing metadata and uploading new items/pages. I've been encouraged to go on by Tpt.

Alex brollo
Re: [Wikitech-l] Non-Violent Communication
While browsing the web for new trends in human-horse communication and horse management, I found the website of Marjorie Smith, and I've been deeply influenced by her; her thoughts about the links between man-to-man and man-to-horse communication - really an example of the advantages of NVC - were extremely interesting and inspiring. I don't know why she removed her website from the web, but I saved a copy of it on my own website, with an Italian translation (with Marjorie's permission) but - luckily - with the original English front text. You can find it here: http://www.alexbrollo.com/people-for-peace/

If you like horses and peace, it's a very interesting text. It draws attention to fear, and to how important overcoming fear is for NVC. Alex

2014-02-18 4:33 GMT+01:00 Isarra Yos zhoris...@gmail.com:
> If you're pissed, that's when you use something like NVC, except taking it even further, perhaps. Put other people on edge too, but then if they do anything about it, well... I think this may be the standard approach on a lot of discussion boards on enwp.
>
> On 18/02/14 03:26, Adam Wight wrote:
> Interesting... I have very little authority to stand on, but in my exposure to so-called NVC, it seems more appropriate for diplomatic negotiations than for any real-life human situation. IMO this approach boils down to getting your way without looking like a dick. Creeps me out. That said, yes it's important to always deal generously with others. Unless you're pissed :p
> love, Adam
>
> On Mon, Feb 17, 2014 at 3:14 PM, Derk-Jan Hartman d.j.hartman+wmf...@gmail.com wrote:
> On 17 feb. 2014, at 21:45, Monte Hurd mh...@wikimedia.org wrote:
> +1 When I read certain threads on this list, I feel like the "assume good faith" principle is often forgotten. Because this behavior makes me not want to participate in discussions about issues I actually care about, I wonder how many other voices, like mine, aren't heard, and to what degree this undermines any eventual perceived consensus? To be sure, if you don't assume good faith, your opinion still matters, but you unnecessarily weaken both your argument and the discussion.
>
> +many
>
> Yes, on this list we have some strong opinions and we aren't always particularly careful about how we express them, but assume good faith [1] does indeed go a long way and that should be the default mode for reading. The default mode for writing should of course be "don't be a dick" [2]. We have to remember that although many people here are well versed in English, it is often not their mother tongue, making it more difficult to understand the subtleties of the opinions of others and/or to express theirs, which might lead to frustration for both sides. And some people are simply terse where others are blunt, and some people have more time than others to create replies or to wait for someone's attempts to explain something properly. Being inclusive for this reason is usually regarded as a good thing and is thus a natural part of assume good faith. It is why 'civility' is often so difficult to map directly to community standards: it is too tightly coupled with one's own norms, values and skills to be inclusive.
>
> I'm personally good with almost anything that keeps a good distance from both Linus Torvalds-style and NVC. We shouldn't be afraid to point out errors or have hefty discussions, and we need to keep it inside the lines where people will want to participate. But this is no kindergarten either, and some of the more abrasive postings have made a positive difference. It's difficult to strike the right balance, but it's good to ask people once in a while to pay attention to how we communicate.
>
> DJ
>
> [1] https://meta.wikimedia.org/wiki/Assume_good_faith
> [2] https://meta.wikimedia.org/wiki/Don%27t_be_a_dick
>
> PS. "Because this behavior makes me not want to participate in discussions about issues I actually care about, I wonder how many other voices, like mine, aren't heard, and to what degree this undermines any eventual perceived consensus?" If that's what you think of wikitech-l, I assume it is easy to guess what you think about the talk page of Jimmy Wales, en.wp's Request for adminship and en.wp's Administrator noticeboard? :)
>
> PPS. I'm quite sure Linus would burn NVC to the ground if he had the chance :) For those who haven't followed it and who have a bit of time on their hands: there was a very 'interesting' flamewar about being more professional in communication on the Linux kernel mailing list last July. http://arstechnica.com/information-technology/2013/07/linus-torvalds-defends-his-right-to-shame-linux-kernel-developers/ If you distance yourself a bit and just read everything, you'll find that there is some basic truth to both sides of the spectrum, and it basically once again sums up to: we often forget how potty trained we are, even more so that there are different styles of potty around the world and
[Wikitech-l] jCanvas module
I'm playing a little bit with HTML5 features; I see that there's a jStorage module, but I didn't find a jCanvas module. Is there any interest in one? Alex
[Wikitech-l] Please use sitenotice when a new version of software is deployed
Users are very confused and worried any time a new version of the wiki software is launched and tested, and some major or minor bug invariably comes out. A clear message using the central sitenotice, with links to doc pages listing the changes at different levels of detail and to their talk pages to discuss them and to report bugs, is mandatory IMHO. Tech news are largely insufficient; evidence of work in progress should be clearly visible on all pages of the affected projects. It's a basic matter of Wikilove. Alex
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
In the meantime, I tested the {{urlencode:...|WIKI}} trick; it works perfectly for quotes, html tags such as <br /> and link wikicode. Now it can be used both for the Autore and Intestazione templates on it.wikisource, and I hope for MediaWiki:Proofreadpage_index_template too. But it fails with templates: templates passed as a parameter are parsed before urlencode can do its masking job. See [[:commons:User:Alex brollo/Sandbox]] for my test, which uses an instance of a modified Book template (my interest is focused on the Book and Creator templates).

Presently my way of recovering the data is a simple AJAX query, but as an ecologist I'd like to save both bandwidth and server load. :-)

Alex brollo
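On the reading side, unwrapping the value again in a gadget is a one-liner; a rough sketch (the element and attribute names are only an example):

// read a value stored by the template with {{urlencode:...|WIKI}};
// WIKI-style encoding turns spaces into underscores, so undo both steps
// (ambiguous only if the original text itself contains underscores)
var raw = $("#authorRecord").attr("data-author");
var value = decodeURIComponent(raw).replace(/_/g, " ");
console.log(value);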
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
2013/1/23 Paul Selitskas p.selits...@gmail.com:
> It definitely needs a redesign or a different approach. I believe that putting rendered view into data attributes is the worst practice ever. Data is for data, and if you want to put rendering onto the client's shoulders (that is why you want these data attributes, right?), then you should not mix client- and server-side together.

No, I don't want that at all. I only need to get clean parameter values wrapped into the rendered page, avoiding a difficult (sometimes impossible) reverse engineering to get them, and avoiding an AJAX call to get them as-they-are. I see infoboxes as potentially excellent records whose data are currently impossible to read and reuse (for a large variety of interesting uses); I think that's a pity and somewhat a resource-wasting situation. Rendering is only one of dozens of possible uses - but no data, no use of data.

The whole thing is very simple and effective if the infobox template code is designed from the beginning to accept clean string data without any wikicode or html code inside; but I see that very few infoboxes are designed to get such clean data and nothing else.

Alex brollo
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
2013/1/22 Paul Selitskas p.selits...@gmail.com:
> What do you mean by "any wikicode (template call, parameter, link) present into the value of infobox parameter breaks the stuff, since it is parsed and expanded by parser with unpredictable results"? If your {{{author}}} doesn't have anything and it's acceptable, then make it {{{author|}}}, or {{#if:{{{author|}}}|<span ...>}}. Please clarify the statement above.

Imagine that my infobox had a parameter author=, and imagine a clean content like this:

author=Alessandro Manzoni

With my template code <span data-author="{{{author}}}"></span> I get into the parsed html:

<span data-author="Alessandro Manzoni"></span>

Perfect! But imagine that my template parameter is:

author=[[Alessandro Manzoni]]

When I pass the parameter content to <span data-author="{{{author}}}"></span>, I don't get into the html page what I'd like:

<span data-author="[[Alessandro Manzoni]]"></span>

since the wikicode [[Alessandro Manzoni]] will be interpreted by the server, and parsed/expanded into an html link as usual, resulting in a big mess. The same occurs for any wikicode and/or html passed into an infobox template parameter.

Alex brollo
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
2013/1/22 Paul Selitskas p.selits...@gmail.com:
> There will be no mess. You'll just get <span data-author="[[Alessandro Manzoni]]"></span> (did you even lift^Wtry, bro? :)), at least at Wikipedia that is what I get. If it could pass raw HTML into attributes, you'd get a huge hole for XSSploits lovers.

You're right, I used a wrong example. I got troubles from html codes, quotes and templates, not from links. Well, it seems that {{urlencode:{{{1|}}}|WIKI}} solves everything. Thanks. I'll test it on our main infoboxes. I apologize for my question (perhaps not so deep).

Alex brollo
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
I tried to build a template which wraps template parameters into data- attributes. The first results were encouraging; then I found something logical but unexpected that crushed the whole idea. I wrote into the code of an infobox-like template something like this:

<span data-author="{{{author}}}" data-birthdate="{{{birthDate}}}"></span>

and I very happily saw that the html code had my data wrapped into such span tags. But I was testing my code with "clean" templates, i.e. templates which have no wikicode in parameter values (as usually occurs on it.wikisource). As soon as I tested my idea on another project (Commons), I found that any wikicode (template call, parameter, link) present in the value of an infobox parameter breaks the whole thing, since it is parsed and expanded by the parser with unpredictable results.

So... I ask you again: is there any sound reason (i.e. safety related, or server load related) why HTML comments wrapped into the raw page wikicode should not be sent back into the html rendering as they are?

Alex brollo
[Wikitech-l] A layman omnipotent script
How many lines does a script need, and how many parameters, to get anything from anywhere (in the MediaWiki world) and to do anything with the result? Well, I'm surprised that the answer to both questions is: one.

function getHtmlNew(parametri) {
    $.ajax(parametri.ajax).done(function (data) {
        parametri.callback(parametri, data);
    });
}

Yes, parametri is a pretty complex object, and callback() can be very simple or extremely complex; anyway it works, and it is a script of one line, needing one parameter only. It works with the Wikidata API too.

Alex brollo
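Just to show what one of those parametri objects can look like in practice (names and the query are only illustrative):

var parametri = {
    // any jQuery.ajax settings object works here
    ajax: {
        url: "https://it.wikisource.org/w/api.php",
        data: { action: "parse", page: "Pagina_principale", format: "json" },
        dataType: "jsonp" // cross-wiki calls need jsonp (or CORS)
    },
    // called with the same object plus the API response
    callback: function (parametri, data) {
        console.log(data.parse && data.parse.title);
    }
};
getHtmlNew(parametri);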
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
2013/1/1 Paul Selitskas p.selits...@gmail.com:
>> Exactly. Nevertheless: is HTML5 already in use? If it isn't, when will it be introduced into any wiki?
>
> HTML5 was introduced into Wikipedia (and MediaWiki by default, see $wgHtml5 [1]) lately. FYI, on be.wikisource data fields are used to make a link to both Belarusian Wikipedias in a link hover! [2]
>
> [1] http://www.mediawiki.org/wiki/Manual:$wgHtml5
> [2] http://be.wikisource.org/wiki/MediaWiki:Common.js (bottom of the code)

As I said, I don't know if HTML5 is in use or not; but try to save a <span id="container" data-test="This is a test data"></span> into the raw code of any page, save it, and then use the js console of Chrome on the resulting page in view mode with this:

$("#container").attr("data-test")

and you'll get "This is a test data". This is largely sufficient (thanks again for the suggestion!) :-)

Alex brollo
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
Thanks for the suggestion to dig into the HTML5 data stuff; I'll study the matter. Nevertheless: is HTML5 already in use? If it isn't, when will it be introduced into any wiki?

I didn't find in your answers a sound reason to strip away html comments, other than that they will be useless once HTML5 is usable. I'm viewed as a data maniac on it.wikisource; we use any trick to manage data/metadata (cookies, microformats, labeled sections...), and a good serialized JSON object, automatically built by js when pages are saved and wrapped into an html comment, would have been very useful for us in the past, with no overload of code or of servers.

Alex brollo
Re: [Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
Perfect! A data- attribute can contain anything, and it works perfectly. It can also contain a JSON-stringified object added in edit mode into a span (so that a whole dictionary can be passed in a single data- attribute). It's just what I needed.

Alex brollo
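A tiny sketch of that round trip, with purely illustrative names:

// at save time: pack a whole dictionary into one data- attribute
var record = { author: "Alessandro Manzoni", birthDate: "1785" };
$("#container").attr("data-record", JSON.stringify(record));

// later, in view mode: get the dictionary back
var stored = JSON.parse($("#container").attr("data-record"));
console.log(stored.author); // "Alessandro Manzoni"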
[Wikitech-l] Html comments into raw wiki code: can they be wrapped into parsed html?
I'd like to use html comments in raw wiki text, as effective, server-inexpensive data containers that could be read and parsed by a js script in view mode. But I see that html comments written into the raw wiki text are stripped away by the parsing routines. I can access the raw code of the current page in view mode by js with an index.php or an api.php call, and I do, but this is much more server-expensive IMHO.

Is there any sound reason to strip html comments away? If there is no sound reason, could such stripping be avoided?

Alex brollo
[Wikitech-l] A bug into pywikipedia: help needed
I found a bug in the pywikipedia library, wikivoyage-family.py. In the first class, there's a wrong line that blocks login for bots:

self.name = "Wikivoyage"

The right line should be:

self.name = "wikivoyage"

with lower case; fixing the script in my local copy of the pywikipedia library, the bug disappears and login works. Can one of you please file a bug in Bugzilla (I can't... I hate Bugzilla and I never use it :-( )? In the meantime, I'll fix the bug manually at any pywikipedia update. Thanks!

Alex brollo
Re: [Wikitech-l] A bug into pywikipedia: help needed
2012/12/21 Bináris wikipo...@gmail.com:
> Neither wikitech-l nor Bugzilla is the right place to complain about Pywikipedia. :-) We have a separate mailing list called Pywikipedia-l (https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l) which is recommended to join if you use Pywiki regularly, and a separate bug tracker on SF: http://sourceforge.net/tracker/?group_id=93107&atid=603138. Choose one! In the Pywiki community you are allowed to hate Bugzilla, git and gerrit. :-)

OK! Thank you! Alex
Re: [Wikitech-l] #switch limits
2012/9/24 Tim Starling tstarl...@wikimedia.org:
> I suppose a nested switch like:
>
> {{#switch: {{{1}}}
>  | 0 = {{#switch: {{{2}}} | 0 = zero | 1 = one }}
>  | 1 = {{#switch: {{{2}}} | 0 = two | 1 = three }}
> }}
>
> might give you a performance advantage over one of the form:
>
> {{#switch: {{{1}}}{{{2}}}
>  | 00 = zero
>  | 01 = one
>  | 10 = two
>  | 11 = three
> }}

I was thinking about something different - to split a long list into a tree of sub-templates, and to use the upper templates to select the right sub-template. This would avoid parsing a single, heavy template, but has the disadvantage of multiple calls to much smaller templates (one for each level); so, if a basic #switch is unexpectedly fast, I don't see a sound reason to add complexity to the code.

Alex
Re: [Wikitech-l] #switch limits
I too sometimes use large switches (some hundreds of entries) and I'm far from happy about it. For larger switches, I use nested switches, but I find it very difficult to compare the performance of nested switches (e.g. a 1000-element switch can be nested into three levels of 10-element switches) against a single global switch. I imagine that there's a performance function of the number of switch levels and the number of switch elements, but I presume it would be difficult to calculate; can someone explore the matter with tests?

Another way would be to implement a .split() function to transform a string into a list, at least; much better, to implement JSON parsing of a JSON string, to get lists and dictionaries from strings saved into pages. I'd guess a dramatic improvement in performance, but I'm far from sure about it.

Alex brollo
Re: [Wikitech-l] #switch limits
Some atomic, page-specific data set is needed, and it's perfectly logical and predictable that creative users try any trick to force wikicode and template code to get such a result. I deeply appreciate, and am enthusiastic about, the Wikidata project, but I wonder about this issue: is Wikidata a good data container for data sets needed by a single, specific page of a single project?

E.g. consider citations from the Bible: they have a widely used structure, something like "Genesis, 4:5" to point to verse 5 of chapter 4 of Genesis. A good switch can translate this reference into a link+anchor to a Page: page of a wikisource version of the Bible; a different switch will translate this reference into a link+anchor pointing to the ns0 version of the same Bible. Can you imagine hosting such a set of data on Wikidata? I can't; some local data container is needed; #switch does the job perfectly, and creative users will find this way and will use it, since it's needed to get the result. Simply build something lighter, more efficient and simpler than #switch to get the same result, and users will use it.

Alex brollo
Re: [Wikitech-l] AJAX sharing of Commons data/metadata
Just to give some final feedback on this thread, which has been very useful for my experiments: work is going on fast, and is presently focused on the alignment of some structured templates whose data are shared between Commons and Wikisource: Creator vs. Author; Book vs. MediaWiki:Proofreadpage_index_template. Thanks again to the list members. :-)

Alex
Re: [Wikitech-l] AJAX sharing of Commons data/metadata
2012/8/30 Brion Vibber br...@pobox.com:
> Luckily, if you're using jQuery much of the low-level stuff can be taken care of for you. Something like this should work for API calls, automatically using the callback behind the scenes:

Thanks! Really, I tested some interproject AJAX API calls with no luck, but I never tested a similar syntax. Can I ask for some more help by email, if needed (and I guess it will be...)? Obviously I'll share any good result in this thread.

Unluckily, I found that the fields of the Information and Book templates are well and clearly tagged with informative ids, while the Creator fields are not. I posted a proposal there, but I know that editing complex templates is a hard job.

Alex
Re: [Wikitech-l] AJAX sharing of Commons data/metadata
Thanks again Brion, it works perfectly and - strange to say - I had no serious difficulty, just a little bit of review of the API calls and of the structure of the resulting formidable objects. It's really much simpler to parse the original template contents than the html resulting from their expansion ;-) and AJAX API calls are SO fast!

Alex
[Wikitech-l] AJAX sharing of Commons data/metadata
As you know, wikisource needs robust, well-defined data, and there's a strict, deep relationship between wikisource and Commons, since Commons hosts the images of books as .djvu or .pdf files. Commons shares both the images and the contents of the information page of the images, so that any wiki project can display a view-only pseudo-page by accessing a local page named after the file name on Commons.

Working on self-made data semantization on it.wikisource using a lot of creative tricks, we discovered that it's hard/almost impossible to read the contents of pages of other projects by AJAX calls, because of the well-known same-origin policy, but that local File: pages are considered as coming from the same origin, so they can be read like any other local page; and this AJAX call, asking for the content of e.g. File:Die_Judenfrage_in_Deutchland_1936.djvu:

html = $.ajax({url: "http://wikisource.org/wiki/File:Die_Judenfrage_in_Deutchland_1936.djvu", async: false}).responseText;

gives back the html text of the local File: view-only page, and this means that any data stored in the information page on Commons is freely accessible to a javascript script and can easily be used locally. In particular, data stored in Information and/or (much better) Book and Creator templates can be retrieved and parsed.

Has this been described/used before? It seems a plain, simple way to share and disseminate good, consistent metadata into any project; and it works from today, without any change to the current wiki software. If you like, I'm sharing a practical test use of this trick on wikisource.org too: you can import User:Alex brollo/Library.js and a lot of small, original scripts will be loaded; click on the "metadata" button from any page connected to a File: page (namespaces Index, Page) and you'll see a result coming from such an AJAX call.

Alex brollo, from it.wikisource
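Once the html of the local File: page has been fetched this way, jQuery can dig the single fields out of it; a rough sketch (the id below is only an example taken from the Information template, and may differ for Book/Creator fields):

// synchronous call, as above, relative to the wiki you are reading from
var html = $.ajax({
    url: "/wiki/File:Die_Judenfrage_in_Deutchland_1936.djvu",
    async: false
}).responseText;
// Information-style templates mark their label cells with ids such as
// "fileinfotpl_aut" (author); the value sits in the next table cell
var author = $(html).find("#fileinfotpl_aut").next("td").text();
console.log($.trim(author));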
Re: [Wikitech-l] AJAX sharing of Commons data/metadata
Thanks for the comments. The relationship between wikisource and Commons is very strict, and there's a large 1:1 match between structured wikisource data stored in well-formed templates (used in nsIndex and ns0) and the Book template; there's also a 1:1 relationship between the Creator namespace on Commons and the Author namespace on wikisource. Djvu and (less used) pdf files are already shared among the different wikisource projects, but the data stored in the information pages of the files are not shared, so each project rewrites them and stores them with a variety of formats and contents, introducing redundancy and deeply undermining coherence. So, we are going to use Commons metadata - already stored in Book and Creator templates - and share them widely among all the projects that need them. When metadata are fetched and parsed they can be used to feed local wikisource templates and/or to align data with automated procedures.

Thanks for the API suggestion, but the question is: does it violate the same-origin AJAX policy? I can read anything by a bot from any project, but AJAX is great to enhance interactivity and to help the user just when the user needs data, i.e. in edit mode. Yes, the solution will be CORS, but I can't wait for future enhancements when data can be accessed and used today, with the present software.

Alex brollo
Re: [Wikitech-l] AJAX sharing of Commons data/metadata
2012/8/29 bawolff bawolff...@gmail.com On Wed, Aug 29, 2012 at 2:24 PM, Alex Brollo alex.bro...@gmail.com wrote: Thanks for comments. [..] Thanks for API suggestion, but the question is: does it violates same origin AJAX policy? I can read anything by a bot from any project, but AJAX is great to enhance interactivity and to help user just when user needs data, i.e. in edit mode. No it doesn't violate the same origin policy. Same origin policy only prevents reading information from other websites, it does not stop you from executing content from other websites (Which always seemed an odd distinction to me...). Thus you can use the api with a callback parameter to get around the same origin policy. Obviously CORS is a much nicer solution. -bawolff
Re: [Wikitech-l] AJAX sharing of Commons data/metadata
> No it doesn't violate the same origin policy. Same origin policy only prevents reading information from other websites, it does not stop you from executing content from other websites (Which always seemed an odd distinction to me...). Thus you can use the api with a callback parameter to get around the same origin policy.

Ouch, this is a little bit above my skill level (really, I discovered AJAX not long ago). Where can I find some examples of inter-project API data exchange with the callback parameter? E.g. I'd like to get the raw content and the html rendering of File:I_promessi_sposi_(1840).djvu from wikisource, with an AJAX callback call. Which is the needed code? :-(

Alex
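In case it is useful to others later: as far as I understand it, the kind of call meant here looks roughly like this (a sketch; I point it at the Commons API assuming the file description page lives there, and the parameters are only an example - the same call works against any project's api.php):

// jQuery turns dataType "jsonp" into the callback=... parameter,
// which is what gets around the same-origin policy
$.ajax({
    url: "//commons.wikimedia.org/w/api.php",
    dataType: "jsonp",
    data: {
        action: "query",
        titles: "File:I_promessi_sposi_(1840).djvu",
        prop: "revisions",
        rvprop: "content",
        format: "json"
    }
}).done(function (data) {
    var pages = data.query.pages;
    for (var id in pages) {
        // raw wikitext of the file description page (Book template included);
        // action=parse&page=... works similarly for the rendered html
        console.log(pages[id].revisions && pages[id].revisions[0]["*"]);
    }
});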
Re: [Wikitech-l] Wikimedians are rightfully wary
I suppose that a way could be a warning, in the centralSiteNotice or another similar space, optionally shown on any page of any wiki via a gadget/preferences setting (default=disabled). This warning should be brief, informative and focused on possible unexpected results of software changes. Normal users would not see anything; advanced users (sysops and layman programmers) would surely appreciate it a lot. I remember terrible headaches trying to fix unexpected, intriguing local bugs in our rich set of local javascript tools on it.source.

Alex brollo

2012/8/24 Strainu strain...@gmail.com:
> 2012/8/24 Ryan Lane rlan...@gmail.com: Your idea is a great one, except... I was going to say you can't see the forest for the trees, but actually it's the other way around. I think you're too focused on the big picture (communicating with the community) to see that smaller steps can help a great deal. I haven't seen any small step solution that improves the situation, though. Unless there's two way communication then it's the WMF telling people here's what we're going to do without any way for them to give us proper feedback. We can't possibly host discussions with all of our communities, and it's really unfair to only select the biggest ones.
>
> That's exactly what I'm trying to point out to you: the WMF telling people "here's what we're going to do" *on their home wiki* IS a huge improvement. Specifically, on ro.wp, instead of 4-5 people seeing these messages, 50+ people would see the messages on the Village Pump. That's a ten-fold increase in coverage with very little effort. Sure, it's great to have lots of people involved in the discussion leading to a big change, but it's not bad at all to have some people involved in the decision making, but _everybody_ in the loop about the decision taken. Think of it as law-making: some people gather, discuss and take a decision, which is then made public for all interested parties before it comes into force.
>
> I really feel that the blog is the best place for announcements like this. How many people read the blog? How many people combined read the village pumps of the 10 biggest wikipedias? There's a number of decent ways to notify the community of changes. The blog is likely the easiest route for that. No, it isn't. The blog simply does not have enough reach and very likely will never have enough reach no matter what you do to make it popular. I could find tens of other reasons why it's not the best method, but I'll stick to just one: blog posts are at least 2-3 times longer than messages on village pumps. This means 3 times more time to translate.
>
> I think the author of the original article said it best: "Agreement aside, we're seeing a disconnect right now between what the Foundation is spending resources on and the issues faced by the community." If we can't agree on the problem, we will have a very hard time finding solutions.
>
> Strainu
[Wikitech-l] Full support for djvu files
Djvu files are the wikisource standard supporting proofreading. They have very interesting features, being fully open in structure and layering, and allowing fast and effective sharing on the web when they are stored in their "indirect" mode. Most interestingly, their text layer - which can be easily extracted - contains both the mapped text from OCR and metadata. A free library - DjVuLibre - allows full command-line access to any file content.

Presently, the structure and features of djvu files are minimally used. Indirect mode is IMHO not supported at all, there's no means to access the mapped text layer nor the metadata, and only the full text can be accessed once, when creating a new page in the Page namespace.

It would be great IMHO:
* to support indirect mode as the standard;
* to allow free, easy access to the full text layer content from the wikisource user interface.

Alex
Re: [Wikitech-l] Full support for djvu file
> Text layer is stored in img_metadata, which means it can be retrieved by the API (using ?action=query&prop=imageinfo&iiprop=metadata). However when I tried to test this, it didn't seem to work. Maybe trying to return the entire text layer hit some max api result size limit or something. (It'd be really nice if we had some nicer place to store information about files, especially for huge things like the text layer which we don't generally want to load the entire thing all the time. There's a bug about that somewhere in bugzilla land).
>
> Indirect mode (from what I can find out from google) is when you have an index djvu file that has links to all the pages making up the djvu file, so you can start viewing immediately and pages are only downloaded as needed. I'm not sure how such a format would work in terms of uploading it. Unless we convert it on the server side, how would we upload all the constituent files? (I suppose we could tell people to upload tarballs. Then we have to make sure to validate the contents, and communicate to people that the tarball is only for uploaded djvu files.) [Of course until 5 minutes ago I'd never heard of an indirect djvu file, so I could be misunderstanding]
> -bawolff

I use the DjVuLibre library a lot on my PC, both from the console and from python scripts, so I can tell you that it would be very simple to convert a bundled djvu file into an indirect one. Obviously this should be transparent for the uploader, being a fully automatic server-side job.

About the text layer: it's very, very interesting, even if complex. There are command-line DjVuLibre routines to do anything you want, both to read and to edit it. What we currently get is simply the most banal output (the full text); from any IA djvu file you can get much more, i.e. the hierarchical text structure (at page, column, region, paragraph, line, and single-word detail) with the coordinates of any element at any detail level; and you can also get/insert structured metadata, both global and page-specific. Any djvu extraction/editing function runs both on bundled and on indirect djvu files, and obviously any read/edit is much faster when a small, single-page file is addressed.

The coordinates of text elements and the hierarchical structure of the text are extremely interesting, since such a data set could be used to guess formatting: e.g. you could guess centered text, tables, section alignment, headers/footers, poems, paragraphs, and font sizes too. Inter-line spacing could be used to guess chapter titles. Empty text areas are often simply areas covered by illustrations, so an intelligent algorithm could guess their size and position. I imagine that thumbnail generation/purging too would be much more effective and fast.

In brief, we have a Ferrari but are using it with a speed limit of 10 miles/hour. :-)

Alex
Re: [Wikitech-l] Cutting MediaWiki loose from wikitext
I can't understand the details of this talk, but if you like, take a look at the raw code of any ns0 page on it.wikisource and consider that the "area dati" is removed from the wikitext as soon as a user opens the page in edit mode, and rebuilt when the user saves it; or take a look here: http://it.wikisource.org/wiki/MediaWiki:Variabili.js where data used for automation/editing help are collected as js objects.

Alex brollo
Re: [Wikitech-l] Cutting MediaWiki loose from wikitext
I agree that it's ironic to play with a powerful database-built project, and to have no access to, nor encouragement for, organizing our data as they should be organized. But - we do use normal pages as data repositories too, simply marking some specific areas of pages as data areas. More, we use the same page both as a normal wikitext container and as a data container. Why not?

Alex brollo (it.source)
Re: [Wikitech-l] Announcing a new extension - SideBarMenu,
Thank you, we did, but we wrapped the needed jQuery code into a gadget, so that users can set it as a personal preference. Our gadgets are growing and growing in number and performance! :-)

Alex
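The gadget itself boils down to very little jQuery; a rough sketch of the idea (the selector is Vector's sidebar and may need adjusting for other skins):

// keep the sidebar tools in view while scrolling long texts
$(function () {
    $("#mw-panel").css({ position: "fixed", top: 0 });
});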
Re: [Wikitech-l] Announcing a new extension - SideBarMenu,
2012/3/8 Kim Eik k...@heldig.org:
> By fixed do you mean the css style position: fixed; ?

Yes. An absolutely simple, but effective idea. Really, all tools and buttons should have a position:fixed css attribute - particularly when proofreading on wikisource. Recently I also registered at Distributed Proofreaders; they have a specialized proofreading interface (while wikisource has an intelligent customization of a general wiki interface) and I saw that once more we rediscovered the wheel, giving some tools the position:fixed attribute, just as DP does.

Alex
Re: [Wikitech-l] Announcing a new extension - SideBarMenu,
2012/3/8 Kim Eik k...@heldig.org:
> The SideBarMenu simplifies creating multilevel menus. http://www.mediawiki.org/wiki/Extension:SideBarMenu This is my first development project involving mediawiki and has been by request of a large oil company in Norway, Statoil. Any feedback of any kind is appreciated.

Very good! Thanks! Wikisource needs lots of gadgets/tools; your extension could be inspiring and very useful whenever there's a need to add lots of tool links into the sidebar menu. We found too that the sidebar menu is much more comfortable if it is fixed, so that it doesn't scroll when scrolling long texts in edit mode.

Alex brollo
[Wikitech-l] Wiki markup vs. html tags
I'd like to replace the usual wiki markup ''...'' and '''...''' for italic and bold with well-formed - even if deprecated - html tags <i>...</i> and <b>...</b>. Is there any serious issue regarding server load, or safety/compatibility/other? And - generally speaking - is there any project to convert wiki markup into a well-formed one? I presume that there are lots of talks about this; I'd simply like to be pointed to the best of them.

Alex
Re: [Wikitech-l] Wiki markup vs. html tags
Thanks Platonides - it's rewarding to find that I'm not so crazy. :-) I'll subscribe to wikitext-l; I saw a recent, encouraging contribution about wiki markup, just my dream too.

Alex
[Wikitech-l] About page ID
Is there a sound reason to hide the main id of pages so well? Is there any drawback to showing it anywhere in wikis, and to using it much more widely for links and API calls?

Alex brollo
Re: [Wikitech-l] About page ID
Thanks! I'll save this talk as a reference. Really, my question was focused on possible risks or safety issues; it's so strange that a database (since I see wikisource as a database) is mainly indexed, from the user's point of view, on a variable field such as the title of the page, that I suspected some serious safety issue. And - strangely enough - the id is not shown at all in pages, nor is there any magic word or other user-friendly method to get it.

From a practical point of view, many wikisource page titles are very long, they often use non-ascii characters and mixtures of capitalized and non-capitalized characters, and they can't be used as they are as local file names... in brief, I feel all this stuff is almost as annoying as the apostrophes used in wiki markup for bold and italic. ;-)

Alex
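One small thing I found in the meantime: from javascript the page id is at least exposed in the page configuration (on recent MediaWiki versions), so a gadget can use it right away, e.g.:

// numeric id of the page being viewed (0 for pages that don't exist yet)
var pageId = mw.config.get("wgArticleId");
// it can then be fed to the API, e.g. action=query&pageids=...
console.log(pageId);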
Re: [Wikitech-l] Some questions about #switch
Thanks for the contributions. Really, I was going to use large switches, both as associative arrays and as sets. I hoped that the algorithm was based simply on a string search of the code (I presume it is possible, and I know how efficient plain string search is in any decent language), but I guess, from your replies, that I was wrong. :-(

I'll stick strictly to the 2000 items limit; I presume that - in occasional cases - I will be forced to split larger switches into different containers, after some string manipulation with the padleft trick. Well, it's ironic that the wiki is based on a powerful database, while the user interface forces contributors to use all kinds of dirty tricks to simulate basic, extremely simple database-like jobs. :-(

Alex
[Wikitech-l] Some questions about #switch
I'm using #switch more and more in templates; it's surprising how many issues it can solve, and how large the arrays it can manage are. My questions are:

1. Is there a reasonable upper limit for the number of #switch options?
2. Is there a difference in server load between a long list of #switch options and a tree of options, i.e. something like a nested #switch?
3. Is there any drawback in a large use of #switch-based templates?

I presume that my questions are far from original; please give me a link to previous talks or help/doc pages, if any.

Alex
Re: [Wikitech-l] Help us test the VisualEditor prototype
I'm far from skilled enough to understand the whole stuff. Working as hard as I can on wikisource, I found that the most useful tools are js scripts like RegexMenuFramework by Pathoschild, i.e. a container for personal, highly customizable js scripts that work on the wikitext (fixing scannos, formatting, adding specific templates); not much skill is needed for a js beginner like me to build new simple, effective tools, since this is simply dealing with plain text in the text box of a form. I guess it would not be so simple with the highly structured html of the Visual Editor. Is there any hope of keeping js editing of the wikitext/source text in the Visual Editor as simple as it is presently?

Alex
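For what it's worth, the kind of tool I mean is nothing more than this (a toy example of my own, not RegexMenuFramework itself):

// fix a recurring scanno in the edit box with one click
function fixScannos() {
    var $box = $("#wpTextbox1");                       // the standard edit textarea
    $box.val($box.val().replace(/\btbe\b/g, "the"));   // example substitution only
}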
Re: [Wikitech-l] Mediawiki and a standard template library
A big problem of collisions with existing templates with different code could frequently come up. I feel that synonymous templates with different features are a very subtle and harmful trouble. This raises the long-standing problem of redundancy and coherence that so deeply afflicts everything in wiki content, except extension tags. It's impossible to leave the hard task of aligning templates to the willingness of bold, sometimes inexperienced users. Exactly the same goes for experienced users. But perhaps a SharedTemplatesExtension could be built, ensuring a common repository of template routines matched with common names. I can't imagine how this could be done. :-(

Alex
Re: [Wikitech-l] Mediawiki and a standard template library
IMHO, the idea of a decent set of common, shared, English-named, standard templates is really a good one. No one objects to the fact that the reserved words of any programming language are English-named and standard. I presume that many of the existing templates aren't used in ns0 or other content namespaces; in my opinion, the core shared templates should be focused on common content-namespace problems. Please consider too that many templates require some peculiar css settings; perhaps a common, shared css for tags used in content namespaces would be both simpler to do and very useful for building a shared group of templates.

Alex brollo
Re: [Wikitech-l] Reproducing localurl by python urllib.quote()
2011/6/29 Ashar Voultoiz hashar+...@free.fr On 27/06/11 23:14, Platonides wrote: See wfUrlEncode in GlobalFunctions.php I have added tests for this function with r91108. Feel free to propose additional tests :) http://www.mediawiki.org/wiki/Special:Code/MediaWiki/91108 -- Ashar Voultoiz
Re: [Wikitech-l] Reproducing localurl by python urllib.quote()
I also tested MediaWiki encoding against the js encodeURI() function. I found a difference, only one: MediaWiki encodes the apostrophe while encodeURI() doesn't. Obviously the second, big difference is the conversion of spaces into underscores. So, in js, so far I got a good simulation of the localurl: MediaWiki encoding with this:

encodeURI(string.replace(/ /g, "_")).replace(/%27/g, "'")

(the last string is an apostrophe between double quotes). Is there some reason why the wiki encoding is different from js encodeURI (which, I presume, is based on a standard)? I find this a little confusing.

Alex
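Wrapped into a helper, the whole simulation is something like this (a sketch; the apostrophe handling is my assumption, based on the %27 that MediaWiki itself puts into hrefs):

// approximate {{localurl:}} in the browser
function localUrlLike(pageName) {
    // spaces become underscores, then standard URI encoding
    var encoded = encodeURI(pageName.replace(/ /g, "_"));
    // encodeURI leaves the apostrophe alone, so encode it explicitly
    return "/wiki/" + encoded.replace(/'/g, "%27");
}
// localUrlLike("Portal:Children's literature")
//   -> "/wiki/Portal:Children%27s_literature"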
Re: [Wikitech-l] Reproducing localurl by python urllib.quote()
2011/6/27 Platonides platoni...@gmail.com:
> The relevant function is Title::getLocalURL(). I think that in your quote function you need to skip '@$*(),' as well, and .-_ wouldn't be needed there (but urllib.quote could differ from php urlencode). See wfUrlEncode in GlobalFunctions.php

Thanks Platonides, I'll test localurl against your character list. In the meantime, I found that mine was a rather silly question, since any html link to wiki pages produced by the mediawiki software has this kind of code:

<a href="/wiki/Portal:Children%27s_literature" title="Portal:Children's literature">Portal:Children's literature</a>

i.e., there is both the escaped version of the title of the page in the href attribute, and the non-escaped version of the name of the page in the title attribute; the latter being much simpler as a key to select links. :-)

Alex
[Wikitech-l] Reproducing localurl by python urllib.quote()
I found a number of troubles when trying to replicate the localurl: parser function with python urllib.quote(), with a number of annoying bugs. Is this correct/complete?

"/wiki/" + urllib.quote(name-of-page.replace(" ", "_"), "/!,;:.-_")

where name-of-page is utf-8 encoded. Thanks! I apologize for such a banal question.

Alex brollo
Re: [Wikitech-l] New window for external link
2011/4/21 K. Peachey p858sn...@gmail.com:
> http://www.mediawiki.org/wiki/Manual:$wgExternalLinkTarget can be done to affect all outbound links. The (somewhat outdated) page about it gives a little bit of information about why people really dislike it when you do that: http://www.mediawiki.org/w/index.php?title=Manual:Opening_external_links_in_a_new_window&oldid=347475#How_to_open_a_link_in_a_new_window_in_Stricter_HTML_4

Could this option be added to the user preferences?

Alex
Re: [Wikitech-l] A question about templates parsing and caching
2011/4/11 Daniel Friesen li...@nadir-seen-fire.com:
> Side thought... why a #switch library? What happened to the old {{Foo/{{{1}}}|...}} trick?

Simply, {{Foo/{{{1}}}|...}} links to different pages, while {{Foo|{{{1}}}|...}} points to the same page. I had been frustrated when I tried to use Labeled Section Transclusion to build template libraries :-) ; that would be an excellent way to build collections of objects in a wiki page, of both methods and attributes... but #lst doesn't parse raw wiki code from scratch. If it did (i.e. if #lst read the wiki code as it is, before any parsing, entirely ignoring the code outside the labelled section: i.e. ignoring noinclude, html comment tags... anything), interesting scenarios would arise.

But if there's no performance gain with the {{Foo|{{{1}}}|...}} trick, I'll use {{Foo/{{{1}}}|...}} for sure. KISS is always a good guideline. :-)

Alex
Re: [Wikitech-l] A question about templates parsing and caching
2011/4/11 Daniel Friesen li...@nadir-seen-fire.com:
> Though, when we're talking about stuff this complex... that line about using a REAL programming language comes into play... Would be nice if there was some implemented-in-php scripting language we could use that would work on any wiki. I had a project playing around with that idea but it's dead.
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Are you a wikisource contributor? If you are, I guess that you too considered this syntax, pointing to what I wrote in my last message: {{#section:Foo|{{{1}}}}} but... it refuses to run if section {{{1}}} of page Foo is a "method" (while it obviously runs if the section is an "attribute"). :-)

Alex
Re: [Wikitech-l] Aquestion about templates parsing and caching
2011/4/11 Andrew Garrett agarr...@wikimedia.org:
> On Mon, Apr 11, 2011 at 5:59 AM, Roan Kattouw roan.katt...@gmail.com wrote:
>> What we store in memcached is a serialized version of the preprocessor XML tree, keyed on the MD5 hash of the wikitext input, unless it's too small, like Platonides said. This means that if the exact same input is fed to the preprocessor twice, it will do part of the work only once and cache the intermediate result.
>
> Yes, I implemented this with Tim's help to try to cut down on the CPU load caused by lots of Cite templates, IIRC. If I recall correctly, the performance benefit was not particularly substantial.

Ok, so coming back to my idea: building small libraries of work-specific templates inside a unique template doesn't seem a particularly brilliant idea; it is something that can be done only if the templates merged into one are simple and few, and only for the contributors' comfort, if any. Thanks for your interest!

Alex
[Wikitech-l] A question about templates parsing and caching
I'd like to know something more about template parsing/caching for performance issues. My question is: when a template is called, its wikicode, I suppose, is parsed and translated into something running - I can't imagine what precisely, but I don't care so much about that (so far :-) ). If a second call comes to the server for the same template, but with different parameters, is the template parsed again from scratch, or is something from the previous parsing used again, saving a little bit of server load? If the answer is yes, i.e. if the running code of the whole template is somehow saved and cached, ready to be used again with new parameters, perhaps it could be a good idea to build templates as libraries of different templates, using the name of the template as a library name and a parameter as the name of a specific function; a simple #switch could be used to select the appropriate code of that specific function. On the contrary, if nothing is saved, there would be good reasons to keep the template code as simple as possible, and this idea of libraries would be a bad one. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Core html of a wikisource page
I saved the HTML source of a typical Page: page from it.source, the resulting txt file weighing ~28 kBy; then I saved the core HTML only, i.e. the content of <div class="pagetext">, and this file weighs ~2.1 kBy; so there's a more than tenfold ratio between container and real content. Is there a trick to download the core HTML only? And, most important: could this save a little bit of server load/bandwidth? I humbly think that the core HTML alone could be useful as a means to obtain well formed page content, and that this could be useful to obtain derived formats of the page (i.e. ePub). Alex brollo ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Core html of a wikisource page
2011/4/6 Daniel Kinzler dan...@brightbyte.de On 06.04.2011 09:15, Alex Brollo wrote: I saved the HTML source of a typical Page: page from it.source, the resulting txt file having ~28 kBy; then I saved the core html only, i.e. the content of <div class="pagetext">, and this file has ~2.1 kBy; so there's a more than tenfold ratio between container and real content. wow, really? that seems a lot... Is there a trick to download the core html only? there are two ways: a) the old style render action, like this: http://en.wikipedia.org/wiki/Foo?action=render b) the api parse action, like this: http://en.wikipedia.org/w/api.php?action=parse&page=Foo&redirects=1&format=xml To learn more about the web API, have a look at http://www.mediawiki.org/wiki/API Thanks Daniel, API stuff is a little hard for me: the more I study, the less I edit. :-) Just to have a try, I called the same page: the render action gives a file of ~3.4 kBy, the api action a file of ~5.6 kBy. Obviously I'm thinking of bot downloads. Are you suggesting that it would be a good idea to use an *unlogged* bot to avoid page parsing, and to fetch the page code from some cache? I know that some thousands of calls are nothing for wiki servers, but... I always try to get good performance, even from the most banal template. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
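For what it's worth, a minimal sketch (Python 3, standard library only) of the API route Daniel describes, asking for the parsed body only; the ["parse"]["text"]["*"] path is an assumption about how the classic JSON output nests the HTML:

    # fetch only the parsed body of a page via the API (sketch)
    import json, urllib.parse, urllib.request

    def core_html(title, site="https://en.wikisource.org"):
        params = urllib.parse.urlencode({
            "action": "parse", "page": title, "redirects": 1,
            "format": "json", "prop": "text"})
        with urllib.request.urlopen(site + "/w/api.php?" + params) as f:
            data = json.load(f)
        return data["parse"]["text"]["*"]  # inner HTML only, no skin around it

    print(len(core_html("Foo")))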
Re: [Wikitech-l] Core html of a wikisource page
2011/4/6 Daniel Kinzler dan...@brightbyte.de I know that some thousands of calls are nothing for wiki servers, but... I always try to get a good performance, even from the most banal template. That's always a good idea :) -- daniel Thanks Daniel. So, my edits will drop again. I'll put myself into study mode :-D Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Future: Love for the sister projects!
Very interesting! :-) I'm a beginner in js stuff, but I realized how much js can be useful in a number of ways (presently I am a maker of very simple gadgets to help users engaged in proofreading on wikisource). I'm a little worried about additional server load coming from AJAX, and I suggest different policies for local js scripts (which, correct me if I'm wrong, don't imply any server load) and AJAX scripts, which raise performance issues. So, I can't wait for a js wiki but please, keep it simple! So simple that a beginner too could browse it, learn from it and publish anything without any embarrassment. Beginners often feel really stupid, and some good ideas could be lost. Alex brollo ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Topic and cathegory analyser
2011/3/4 Paul Houle p...@ontology2.com Briefly, at the border of OT: I see the magic word ontology in your mail address. :-) :-) I discovered ontology... well, it's a long story. Ontological classification is used to collect data on cancer by the National Cancer Institute; and, strange to tell, I discovered it as an unexpected result of posting a picture on Commons, a low grade prostatic PIN... then I found that NCI uses Semantic MediaWiki. In other terms: from wiki, to wiki again. :-) My aim with ontologies is much, much simpler; it's simply to create something I called catwords, i.e. a system of categorization (the wiki system is perfect) that can also be used as a list of keywords. I can't wait for the installation of DynamicPageList on it.source, since the engine I need is simply a good method to get intersections of categories; but I found that it's not sufficient, some peculiar conventions in categorization are needed too, far from complex... well, I'll tell you the news as soon as I get my tool. :-) Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Gender preference
There's a parallel talk in the it chapter about the gender gap. This gap is part of a larger gap involving software development in general; there are few women programmers. I presume that this highlights a similarity between wiki and a software development environment; and really, many of the most enthusiastic and productive contributors are programmers too. But... as you know, the profile of a programmer is far from the profile of a woman; consider the famous statement The three chief virtues of a programmer are: Laziness, Impatience and Hubris.. Then consider the pedia statement Be bold!, which is a gentler way to name hubris. :-) Is the gender gap so mysterious? Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Captchas and non-English speakers
2011/2/5 David Gerard dger...@gmail.com (It's a real pity reCaptcha is third-party and proprietary.) Well, we it.source fellows are writing our communication about it (it will be published on wikisource-l), but a brief mention of the good news is mandatory here. We have a simple script that extracts word images, corresponding to doubtful OCR interpretations, from any djvu file with a text layer; scripts to upload the fixed words back into the djvu layer are simple too. We posted a first communication on John Vandenberg's en.source user page, and a wikicaptcha is now something possible. See John's talk here: http://en.wikisource.org/wiki/User_talk:John_Vandenberg#reCAPTCHA_for_source Alex brollo ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Captchas and non-English speakers
2011/2/5 Marcus Buck w...@marcusbuck.org Instead of captchas like shipsneeds we of course need words in the local language. It shouldn't be hard to do some statistical analysis of existing articles on the wiki and to collect a sample of common words of limited length that can be combined to form local captchas. (I guess the above-mentioned script drop-down should be a script/language combination drop-down then.) Just to let you know that Aubrey has just presented the it.source idea for a wikicaptcha on wikisource-l :-) Obviously, if a wikicaptcha tool is built and runs, we can do anything, and while interpreting words (in any language) any user will contribute to source transcriptions in a very valuable way. Alex brollo ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Captchas and non-English speakers
2011/2/5 River Tarnell r.tarn...@ieee.org In article AANLkTikWLU5Y8C2UokYRN=v1-zwhb1kthnxi4xtbm...@mail.gmail.com, David Gerard dger...@gmail.com wrote: On 5 February 2011 15:12, Alex Brollo alex.bro...@gmail.com wrote: Just to let you know that Aubrey just presented the it.source idea for wikicaptcha on wikisource-l What would it take to get this into place? What's the captcha load on WMF sites? Would e.g. the toolserver melt under the load? Perhaps on one project at a time? I don't think this should be hosted on the Toolserver; as CAPTCHAs are a core part of the site, they should not rely on the TS to work. - river. IMHO, it could be an opportunity to think again about the role of Commons as a central library. I imagine something like this (a rough sketch of step 2 follows below):
1. as soon as a djvu file with a text layer is uploaded, a complete set of page text layers is extracted, saving word coordinates too;
2. such text layers could be browsed by a script, extracting all words marked as doubtful (usually with a ^ character), but also extracting words which don't match a good dictionary;
3. a dynamic recaptcha database is updated and word images are submitted to wiki contributors, both as a formal captcha for unlogged user edits, and as a volunteer job to help wikisource projects; updates will fix the text files;
4. a tool should be built to upload pure text from such text files into any wikisource project;
5. finally, the refined text could be re-uploaded into the djvu file, so converting it into a djvu file with a wiki text layer.
Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
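A rough sketch of step 2, assuming the text layer has already been dumped to a plain text file (for instance with djvutxt) and that doubtful words carry the ^ marker mentioned above; the file names and the one-word-per-line dictionary are assumptions:

    # step 2 sketch: collect doubtful words from an extracted djvu text layer
    import re

    def doubtful_words(layer_path="layer.txt", dict_path="words.txt"):
        good = set(w.strip().lower() for w in open(dict_path, encoding="utf-8"))
        doubtful = []
        for word in re.findall(r"\S+", open(layer_path, encoding="utf-8").read()):
            clean = word.strip(".,;:!?()[]\"'").lower()
            if "^" in word or (clean and clean not in good):
                doubtful.append(word)
        return doubtful

    print(doubtful_words()[:20])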
[Wikitech-l] Simple Page Object model using #lst
I'd like to share an idea. If you think that I don't know what I am speaking of, probably you're right; nevertheless I'll try. Labeled section transclusion, I presume, simply runs as a substring search in the raw wiki code of a page; it gives back a piece of the page as it is (but removing any <section ... /> tag inside). Imagine that this copy and paste of chunks of wiki code were the first parsing step, the result being a new wikitext, then parsed for template code and other wiki code. If this happened, I imagine that the original page could be considered an object, i.e. a collection of attributes (fragments of text) and methods (template chunks). So, you could write template pages with collections of different template functions, or pages with collections of different data, or mixed pages with both data and functions, any of them being accessible from any wiki page of the same project (while waiting for interwiki transclusion). Then, simply by carefully adding a self-transclusion permission to use chunks of code of a page within the same page, the conversion of a page into a true, even if simple, object would be complete. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Simple Page Object model using #lst
2011/1/25 Jesse (Pathoschild) pathosch...@gmail.com On Tue, Jan 25, 2011 at 8:14 AM, Alex Brollo alex.bro...@gmail.com wrote: If this would happen, I imagine that the original page could be considered an object, t.i. a collection of attributes (fragments of text) and methods (template chunks). Labeled Section Transclusion can be used this way, but it's not very efficient for this. Internally it uses generated regular expressions to extract sections; you can peek at its source code at http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/LabeledSectionTransclusion/lst.php?view=markup . Thanks, but I'm far from understanding such PHP code, nor do I have any idea about the whole exotic thing of wiki code parsing and HTML generation. But, if I were to write something like #lst, I'd index the text using the section tags simply as delimiters, building something hidden like this in the wiki code or in another field of the database: <!-- sections s1[0:100] s2[120:20] s3[200:150] --> where s1, s2, s3 are the section names and the numbers are the offset/length of the text between the section tags in the wiki page string; or something similar to this, built to be extremely simple and fast to parse and to give back substrings of the page in the fastest, most efficient way. Such data should be recalculated only when the page content changes. I guess that the efficiency of sections would increase a lot, encouraging a larger use of #lst. If such parsing of section text were the first step of page parsing, even segments of text delimited by <noinclude> tags could be retrieved. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
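A minimal sketch, in Python, of the kind of hidden offset index described above; it assumes the <section begin="x" /> ... <section end="x" /> tag syntax used by the extension and makes no attempt to handle malformed pairs:

    # build a {name: [(start, end), ...]} map of labelled sections in a wikitext string
    import re

    def section_index(wikitext):
        index = {}
        for m in re.finditer(r'<section begin="?([^"/>]+?)"? */>', wikitext):
            name = m.group(1).strip()
            end_tag = re.compile(r'<section end="?' + re.escape(name) + r'"? */>')
            e = end_tag.search(wikitext, m.end())
            if e:
                index.setdefault(name, []).append((m.end(), e.start()))
        return index

    text = open("page.txt", encoding="utf-8").read()   # page.txt: any wikitext with sections
    print(section_index(text))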
Re: [Wikitech-l] Simple Page Object model using #lst
2011/1/25 Alex Brollo alex.bro...@gmail.com Just to test the effectiveness of such a strange idea, I added some formal section tags into a 6 kBy text file, section.txt, then I wrote a simple script to create a data area; this is the result (a Python dictionary inside an HTML comment) appended to the section.txt file: <!--SECTIONS:{'<section begin=1 />': [(152, 990), (1282, 2406), (4078, 4478)], '<section begin=6 />': [(19, 115)], '<section begin=2 />': [(2443, 2821), (2859, 3256)], '<section begin=4 />': [(1555, 1901)], '<section begin=5 />': [(171, 477)], '<section begin=3 />': [(3704, 4042)]}--> then I ran these lines from Python IDLE:
for i in range(1000):
    f = open("section.txt").read()
    indici = eval(find_stringa(f, "<!--SECTIONS:", "-->"))
    t = ""
    for pair in indici['<section begin=1 />']:
        t += f[pair[0]:pair[1]]
As you see, the code, 1000 times over: opens the file and loads it; selects the data area (find_stringa is a personal string search tool to get substrings) and converts it into a dictionary; retrieves all the text inside the multiple sections named 1 (the worst case in the list: section 1 has three instances: [(152, 990), (1282, 2406), (4078, 4478)]). Time to do 1000 cycles: more or less 3 seconds on a far from powerful PC. :-) Fast, in my opinion! So it can be done, and it runs in an effective way too. Doesn't it? Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] File licensing information support
The interest of the wikisource project in a formal and standardized set of book metadata (I presume from Dublin Core) in a database table is obvious. Some preliminary tests on it.source suggest that templates and the Labeled Section Transclusion extension could have a role as existing wikitext containers for semantized variables; the latter perhaps more interesting than the former, since their content can be accessed directly from any page. I'd like book metadata to be considered from the beginning of this interesting project. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] MATH markup question
2011/1/19 Maury Markowitz maury.markow...@gmail.com I am dipping my toe in MATH for the first time and finding the results somewhat curious. The key appears to be this statement: It generates either PNG images or simple HTML markup, depending on user preferences and the complexity of the expression. Consider the formulas here: http://en.wikipedia.org/wiki/Spherical_tokamak Is there any hope that the PNG and HTML versions of things might be made to look more similar? The characters don't even look the same in some cases (kappa for instance) and this leads to VERY confusing output. You can force PNG rendering both by preferences and by code. But what's more interesting is the use of the badly documented \scriptstyle TeX command, which generates a much smaller and less invasive display of PNGs: see this recent talk on en.source: http://en.wikisource.org/wiki/Wikisource:Scriptorium#Help.21_.28fractions_and_TeX_formatting.29 Alex brollo ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] MATH markup question
2011/1/19 Alex Brollo alex.bro...@gmail.com 2011/1/19 Maury Markowitz maury.markow...@gmail.com I am dipping my toe in MATH for the first time and finding the results somewhat curious. The key appears to be this statement: It generates either PNG images or simple HTML markup, depending on user preferences and the complexity of the expression. Consider the formulas here: http://en.wikipedia.org/wiki/Spherical_tokamak Is there any hope that the PNG and HTML versions of things might be made to look more similar? The characters don't even look the same in some cases (kappa for instance) and this leads to VERY confusing output. You can force PNG rendering both by preferences and by code. But what's more interesting is the use of the badly documented \scriptstyle TeX command, which generates a much smaller and less invasive display of PNGs: see this recent talk on en.source: http://en.wikisource.org/wiki/Wikisource:Scriptorium#Help.21_.28fractions_and_TeX_formatting.29 Alex brollo I added some \scriptstyle tags to inline expressions, and I discovered that they force PNG too, while giving a smaller (better, in my opinion) display of the formulae. Feel free to roll back! It's a test only. Alex brollo ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] MATH markup question
2011/1/19 Maury Markowitz maury.markow...@gmail.com Wow, thanks for the pointer Carl, MathJax is impressive. Alex, your work is appreciated, but I'm not sure exactly what I'm seeing. Can you point me in the right direction to read up a bit more? Never mind, throw away my suggestions and ask Carl. Ubi major minor cessat! :-) Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] From page history to sentence history
It seems a completely different topic, but: is there something to learn about text saving from the smart trick used to store TeX formulas? I did a little bit of reverse engineering on that algorithm; I never found any useful application of it, but much fun. :-) Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Expensive parser function count
Just to give an example: I wrote a different algorithm for [[en:s:Template:Loop]], naming it [[en:s:Template:Loop!]], and I asked for 100 and 101 dots with both of them in an empty sandbox preview. These are the results:
Sandbox, empty, preview:
Preprocessor node count: 35/100
Post-expand include size: 1858/2048000 bytes
Template argument size: 450/2048000 bytes
Expensive parser function count: 0/500
Sandbox, 2 calls to loop to print 100 and 101 dots, preview:
Preprocessor node count: 1045/100
Post-expand include size: 2260/2048000 bytes
Template argument size: 1551/2048000 bytes
Expensive parser function count: 0/500
Sandbox, 2 calls to loop! to print the same dots, preview:
Preprocessor node count: 193/100
Post-expand include size: 2300/2048000 bytes
Template argument size: 680/2048000 bytes
Expensive parser function count: 0/500
Is there really no useful feedback from these data? Is there really no correlation with server load? Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Expensive parser function count
2011/1/14 Tim Starling tstarl...@wikimedia.org However, I'm not sure how that obtained that result, since {{loop!|100|x}} just expands to {{loop|100|x}}, since it hits the default case of the #switch. When I try it, I get a preprocessor node count of 1069, not 193. :-) The {{loop!|100|x}} call is deprecated; loop! only manages numbers between 1 and 10 by itself; larger numbers such as 100 should be obtained by nesting or appending, as suggested in the doc of the template. For backward compatibility, {{loop!|100|x}} simply calls {{loop|100|x}} instead of raising an error. So it's not surprising that the optimization is only seen in the range 1..10. The preprocessor count of 193 comes from the suggested syntax for 100 (1 nesting) + 101 (1 nesting and 1 appending): {{loop!|10|{{loop!|10|x}}}} {{loop!|10|{{loop!|10|x}}}}{{loop!|1|x}} This call is running in [[en:s:Wikisource:Sandbox]] now, and I got the same metrics from that page's HTML:
<!-- NewPP limit report
Preprocessor node count: 193/100
Post-expand include size: 2300/2048000 bytes
Template argument size: 680/2048000 bytes
Expensive parser function count: 0/500
-->
Nevertheless, loop is mainly an example and a test, not such a useful template. Dealing with the trouble of main metadata consistency on wikisource, deeply undermined by redundancy, our way of fixing things really produces higher metrics when compared with other projects' results, but now I have a small toolbox of tricks to evaluate such a difference (rendering time + existing metrics). Obviously it would be great to have better metrics for a good, consistent performance comparison; but, as I said, I don't want to overload servers just to produce new metrics to evaluate server overload. ;-) Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Expensive parser function count
2011/1/12 Platonides platoni...@gmail.com MZMcBride wrote: Doesn't it make much more sense to fix the underlying problem instead? Users shouldn't have to be concerned with the number of #ifexists on a page. MZMcBride Ok, now I feel much more comfortable. These are my conclusions:
# I can feel free to test anything, even if exotic.
# I will pay attention to the HTML rendering time when trying something exotic.
# In the remote case that I really build something server-expensive, and such an exotic thing spreads widely across wiki projects (a very remote case!), some sysop will see the bad results of a bad idea and:
## will fix the parser code, if the idea is good but the software manages it with a low level of efficiency;
## will kill the idea, if the idea is server-expensive and simply useless or wrong.
Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Expensive parser function count
2011/1/11 Tim Starling tstarl...@wikimedia.org On 07/01/11 07:50, Aryeh Gregor wrote: On Wed, Jan 5, 2011 at 8:07 PM, Alex Brollo alex.bro...@gmail.com wrote: Browsing the html code of source pages, I found this statement into a html comment: *Expensive parser function count: 0/500* I think the maximum was set to 100 initially, and raised to 500 due to user complaints. I'd be completely happy if users fixed all the templates that caused pages to use more than 100, then we could put the limit back down. Thanks Tim. So, implementing a simple js to show that value (and the other three values too) in small characters at a border of the page display is not a completely fuzzy idea. As I said, I hate to waste resources - any kind of them. It's a pity that those data are not saved in the XML dump. But I don't want to overload the servers just to get data about server overload. :-) Just another question about resources: I can get the same result with an AJAX call or with a #lst (labeled section transclusion) call. Which one is lighter for the servers, in your opinion? Or are they more or less similar? Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Expensive parser function count
2011/1/11 Aryeh Gregor simetrical+wikil...@gmail.com Overall, I'd advise you to do whatever minimizes user-visible latency. That directly improves things for your users, and is a decent proxy for server resource use. So use whichever method takes less time to fully render. This is rather more practical than trying to consult MediaWiki developers about every detail of your program's implementation, which is unlikely to be used widely enough to greatly affect server load anyway, and even if it were we couldn't necessarily give intelligent answers without knowing exactly what the program is doing and why. I'm already using your suggestion: today I removed a complex test template from our village pump (replacing it with a link to a subpage, visited by interested users only) and, really, the difference in rendering the village pump page was obvious. Probably the best is to use all the tricks together, paying much more attention to widely used templates and frequently parsed pages than to exotic, rarely used ones. Unluckily for the servers, the worst, heaviest pages are often the most frequently parsed and re-parsed too, such as village pump pages; nevertheless they are the most useful for the community, so they deserve some server load. Nevertheless... sometimes people tell me don't use this hack, it is server overloading... sometimes it isn't, or it is simply an undocumented, unproven personal opinion. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Expensive parser function count
2011/1/11 Casey Brown li...@caseybrown.org That's good, but also keep in mind that, generally, you shouldn't worry too much about performance: http://en.wikipedia.org/wiki/WP:PERF. (Had to throw in the little disclaimer here. ;-)) Yes, I got this suggestion, but when I try new tricks and new ideas, someone often tells me Please stop! This is server overloading! It's terribly heavy!; when I try to go deeper into the server overload details, others tell me Don't worry so much about performance. This is a little confusing for a poor DIY contributor ^__^ Nevertheless, some of my ideas will spread over the *whole* it.source project by imitation and by bot activity (so that any mistake of mine could really have a significant effect; a small one... it.source is nothing when compared with the whole set of wiki projects...), and there's the risk that some of my ideas could spread into other projects too, so I try to be careful. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Expensive parser function count
Browsing the HTML code of source pages, I found this statement in an HTML comment: *Expensive parser function count: 0/500* I'd like to use this statement to evaluate the lightness of a page, mainly testing the expensiveness of the templates on the page, but: in your opinion, given that the best would be a 0/500 value, what are the limits for a good, a moderately complex, and a complex page, just to have something to work with? What is a really alarming value that needs fast fixing? And wouldn't it be a good idea to display this datum on the page - just with a very small mark or string in a corner - allowing fast feedback? Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
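A minimal sketch of scraping those numbers out of a rendered page, assuming the report still sits in an HTML comment of roughly the form quoted in this thread (Python 3, standard library only; the page title is just an example):

    # pull the NewPP limit report values out of a page's rendered HTML (sketch)
    import re, urllib.parse, urllib.request

    def limit_report(title, site="https://it.wikisource.org"):
        url = site + "/wiki/" + urllib.parse.quote(title)
        html = urllib.request.urlopen(url).read().decode("utf-8")
        m = re.search(r"<!--\s*NewPP limit report(.*?)-->", html, re.S)
        report = {}
        for line in (m.group(1).strip().splitlines() if m else []):
            if ":" in line:
                key, value = line.split(":", 1)
                report[key.strip()] = value.strip()
        return report

    print(limit_report("Wikisource:Sandbox"))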
Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)
I apologize, I sent an empty reply. :-( Just a brief comment: there's no need to search for a perfect wiki syntax, since it already exists: it's the present model of well formed markup, i.e. XML. While digging into subtler troubles with wiki syntax, i.e. difficulties in parsing it with scripts or understanding fuzzy behavior of the code, I always find trouble coming from the simple fact that wiki markup isn't intrinsically well formed - it doesn't respect the simple, basic rules of a well formed syntax: strict and evident rules about the beginning and ending of a modifier; no mixing of attributes and content inside its tags, i.e. templates. In part, wiki markup can be hacked to take a step forward; I'm using more and more well formed templates, split into two parts, a starting template and an ending template. Just a banal example: it.source users are encouraged to use the {{Centrato|l=20em}} text ... </div> syntax, where the text - as you see - is outside the template, while the usual syntax {{Centrato| text ... |l=20em}} mixes tags and content (Centrato is the Italian name for center and the l attribute states the width of the centered div). I find such a trick extremely useful when parsing text, since - as follows from the use of a well formed markup - I can retrieve the whole text simply by removing any template code and any HTML tag; an impossible task with the common, not well formed syntax, where nothing tells you about the nature of the parameters: they can only be classified by human understanding of the template code or by the whole body of the wiki parser. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
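A small illustration of why the well-formed convention helps a parsing script: with paired opening/closing templates the content never sits inside the {{...}} call, so stripping the markup can be done with a couple of regular expressions (a sketch only, not a general wikitext parser; the sample string is hypothetical):

    # strip template calls and html-ish tags from "well formed" wikitext (sketch)
    import re

    def plain_text(wikitext):
        text = re.sub(r"\{\{[^{}]*\}\}", "", wikitext)   # drop template calls (nesting not handled)
        text = re.sub(r"<[^>]+>", "", text)              # drop html tags such as </div>
        return re.sub(r"\s+", " ", text).strip()

    sample = "{{Centrato|l=20em}} some centered text </div>"
    print(plain_text(sample))   # -> "some centered text"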
Re: [Wikitech-l] How would you disrupt Wikipedia?
Can I suggest a really simple trick to inject something new into a stagnating wikipedia? Simply install Labeled Section Transclusion on a large pedia project; don't ask, simply install it. If you asked, typical pedian boldness would raise a comment like Thanks, we don't need such a thing, for sure. They need it... but they don't know it, nor can they admit that a small sister project like source currently uses something very useful. Let them discover the surprising power of #lst. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] How would you disrupt Wikipedia?
2011/1/4 Roan Kattouw roan.katt...@gmail.com Just from looking at the LST code, I can tell that it has at least one performance problem: it initializes the parser on every request. This is easy to fix, so I'll fix it today. I can also imagine that there would be other performance concerns with LST preventing its deployment to large wikis, but I'm not sure of that. Excellent, I'm a passionate user of the #lst extension, and I like that its code can be optimized (so I feel comfortable using it more and more). I can't read PHP, and I take this opportunity to ask you:
1. is the #lsth option compatible with the default #lst use?
2. I can imagine that #lst simply runs as a substring finder, and I imagine that substring search is really an efficient, fast and resource-sparing server routine. Am I right?
3. when I ask for a section of a page, is the same page saved in a cache, so that subsequent calls for other sections of the same page are fast and resource-sparing?
What a creative use of #lst allows, if it is really an efficient, light routine, is to build named variables and arrays of named variables in one page; I can't imagine what a good programmer could do with such a powerful tool. I'm, as you can imagine, far from a good programmer; nevertheless I easily built routines with unbelievable results. Perhaps, coming back to the topic: a good programmer would disrupt wikipedia using #lst? :-) Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] How would you disrupt Wikipedia?
2011/1/4 Roan Kattouw roan.katt...@gmail.com What a creative use of #lst allows, if it is really an efficient, light routine, is to build named variables and arrays of named variables in one page; I can't imagine what a good programmer could do with such a powerful tool. I'm, as you can imagine, far from a good programmer; nevertheless I easily built routines with unbelievable results. Perhaps, coming back to the topic: a good programmer would disrupt wikipedia using #lst? :-) Using #lst to implement variables in wikitext sounds like a terrible hack, similar to how using {{padleft:}} to implement string functions in wikitext is a terrible hack. Thanks Roan, your statement sounds very alarming to me; I'll open a specific thread about it on wikisource-l, quoting this talk. I'm making every effort to avoid server/history overload, since I know that I am using a free service (I just fixed the {{loop}} template on it.source to optimize it, at my best...) and if you are right, I'll have to deeply change my approach to #lst. :-( Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] WikiCreole (was Re: What would be a perfect wiki syntax? (Re: WYSIWYG))
2011/1/4 Brion Vibber br...@pobox.com: Indeed, Google Docs has an optimized editing UI for Android and iOS that focuses precisely on making it easy to make a quick change to a paragraph in a document or a cell in a spreadsheet (with concurrent editing). http://www.intomobile.com/2010/11/17/mobile-edit-google-docs-android-iphone-ipad/ A little bit of OT: try the new vector image editor of Google Docs; it exports images in SVG format, and I found it excellent for building such images and uploading them to Commons. Now a free-roaming thought about templates, just to share an exotic idea. The main issue with template syntax is the casual, free, unpredictable mixture of attributes and content in the template parameters. It's necessary, IMHO, to convert them into somehow well formed structures, so that the content can be pulled out of the template code. This abstract structure could be something like this:
{{template name begin|param1|param2|...}}
{{optional content 1 begin}} text 1 {{optional content 1 end}}
{{optional content 2 begin}} text 2 {{optional content 2 end}}
...
{{template name end}}
Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] What would be a perfect wiki syntax? (Re: WYSIWYG)
2011/1/4 Rob Lanphier ro...@robla.net On Mon, Jan 3, 2011 at 5:54 PM, Chad innocentkil...@gmail.com wrote: On Mon, Jan 3, 2011 at 8:41 PM, Rob Lanphier ro...@wikimedia.org wrote: If, for example, we can build some sort of per-revision indicator of markup language (sort of similar to mime type) which would let us support multiple parsers on the same wiki, then it would be possible to build alternate parsers that people could try out on a per-article basis (and more importantly, revert if it doesn't pan out). The thousands of MediaWiki installs could try out different syntax options, and maybe a clear winner would emerge. Or you end up supporting 5 different parsers that people like for slightly different reasons :) Yup, that would definitely be a strong possibility without a disciplined approach. However, done correctly, killing off fringe parsers on a particular wiki would be fairly easy to do. Just because the underlying wiki engine allows for 5 different parsers, doesn't mean a particular wiki would need to allow the creation of new pages or new revisions using any of the 5. If we build the tools that allow admins some ability to constrain the choices, it doesn't have to get too out of hand on a particular wiki. If we were to go down this development path, we'd need to commit ahead of time to be pretty stingy about what we bless as a supported parser, and brutal about killing off support for outdated parsers. Rob ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] How would you disrupt Wikipedia?
2010/12/31 Conrad Irwin conrad.ir...@gmail.com Evolution is the best model we have for how to build something, the way to keep progress going is to continually try new things; if they fail, meh, if they succeed — yay! Just to add a little bit of pure theory to the talk: the wiki project is simply one of the most interesting, and successful, models of adaptive complex systems theory. I encourage anyone to take a deeper look into it. It's interesting both for wiki users/sysops/high level managers and for complex systems researchers. I guess complex systems theory would suggest policies too. Just an example: as in evolution, the best environment where something new appears is not the widest environment, but the small ones, the islands, just like the Galapagos in evolution! This would suggest great attention to what happens in the smaller wiki projects. I guess the most interesting things can be found there, while not so much evolution can be expected in the mammoth project. ;-) Alex (from it.source) ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] How would you disrupt Wikipedia?
2010/12/30 Neil Kandalgaonkar ne...@wikimedia.org On 12/29/10 7:26 PM, Tim Starling wrote: Making editing easier could actually be counterproductive. If we let more people past the editing interface barrier before we fix our social problems, [...] This is an interesting insight! Yes, it's really interesting and enlightening! I'm following another talk about StringFunctions; and I recently got an account on the toolserver (I only hope that my skill is barely sufficient!). In both cases, there's an issue of security by obscurity. I hated it at the beginning, but perhaps such an approach is necessary; it's the simplest way to get a very difficult result. So, what's important is the balance between simplicity and complexity, since this turns into a contributor filter. At the beginning, wiki markup was designed to be very simple. A very important feature of markup was sacrificed: the code is not well formed. There are lots of simple but ambiguous tags (for bold and italic characters, for lists); tags don't need to be closed; text content and tags/attributes are mixed freely in the template code. This makes their use simpler, but causes terrible puzzles for advanced users facing unusual cases, or trying to parse wikitext with scripts, or converting wikitext into a formally well formed markup. My question is: can we imagine moving that balance a little bit, accepting a little more complexity, and thinking of a well formed wiki markup? Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] How would you disrupt Wikipedia?
2010/12/29 MZMcBride z...@mzmcbride.com Neil Kandalgaonkar wrote: Let's imagine you wanted to start a rival to Wikipedia. Assume that you are motivated by money, and that venture capitalists promise you can be paid gazillions of dollars if you can do one, or many, of the following: 1 - Become a more attractive home to the WP editors. Get them to work on your content. 2 - Take the free content from WP, and use it in this new system. But make it much better, in a way Wikipedia can't match. 3 - Attract even more readers, or perhaps a niche group of super-passionate readers that you can use to build a new community. In other words, if you had no legacy, and just wanted to build something from zero, how would you go about creating an innovation that was disruptive to Wikipedia, in fact something that made Wikipedia look like Friendster or Myspace compared to Facebook? And there's a followup question to this -- but you're all smart people and can guess what it is. It's simply the rule of evolution! The day this happens - that something appears, collecting all the best from wiki and adding something better and successful - wiki will slowly disappear. But all the best of wiki will survive in the emerging species; where's the problem, if you don't consider wiki in terms of competition, but in terms of utility? I'm actively working for wiki principles, not at all for the wiki project! I hope that this will not be considered offensive to the wiki community. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] StringFunctions on enwiki?
2010/12/29 Maciej Jaros e...@wp.pl @2010-12-28 22:22, MZMcBride: Alex Brollo wrote: I too don't understand precisely why string functions are so discouraged. I saw extremely complex templates built just to do (with a high server load I suppose in my ignorance...) what could be obtained with an extremely simple string function. https://bugzilla.wikimedia.org/show_bug.cgi?id=6455#c92 (and subsequent comments) I would say en wiki admins simply need to remove such abuse from things like: http://en.wikipedia.org/w/index.php?title=Template:Asbox/templatepage&action=edit This is just a quick example I found - maybe the category was needed at some point, but you can do the same working on a dump or with a bot and just get it done once. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] StringFunctions on enwiki?
@2010-12-28 22:22, MZMcBride: https://bugzilla.wikimedia.org/show_bug.cgi?id=6455#c92 (and subsequent comments) I read almost all of that talk, but I'm far from satisfied. I know (and sometimes meet) the tricks to emulate some string functions with very complex and, I suppose, quite server-loading templates; but such templates can be built only by advanced users (who are few), while they can't be written by normal users (who are many). It looks to me simply like another case of safety through obscurity: is this approach consistent with the wiki philosophy? Is it a good idea to remove a tool-1 simply to avoid its abuse? Wouldn't it be much better to have both the tool-1 and another tool-2 designed to avoid the abuse of tool-1? Should the wiki project as a whole be removed, just because there are definitely many attempts to abuse it, requiring a great effort of developers, sysops, bots, and users to fight against those attempts? I don't understand. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] StringFunctions on enwiki?
I too don't understand precisely why string functions are so discouraged. I saw extremely complex templates built just to do (with a high server load I suppose in my ignorance...) what could be obtained with an extremely simple string function. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] API vs data dumps
2010/11/7 Andrew Dunbar hippytr...@gmail.com On 14 October 2010 09:37, Alex Brollo alex.bro...@gmail.com wrote: Hi Alex. I have been doing something similar in Perl for a few years for the English Wiktionary. I've never been sure on the best way to store all the index files I create, especially in code to share with other people like I would like to happen. If you'd like to collaborate, or anyone else for that matter, it would be pretty cool. You'll find my stuff on the Toolserver: https://fisheye.toolserver.org/browse/enwikt Thanks Andrew. I just got a toolserver account, but don't search for any contribution by me yet... I'm very worried about the whole thing and the skills needed. :-( Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] InlineEditor new version (previously Sentence-Level Editing)
2010/10/25 Jan Paul Posma jp.po...@gmail.com Hi all, As presented last Saturday at the Hack-A-Ton, I've committed a new version of the InlineEditor extension. [1] This is an implementation of the sentence-level editing demo posted a few months ago. Very interesting! Obviously I won't see your work until it is implemented on Wikipedia and all the other Wikimedia Foundation projects. Please also consider the specific needs of the sister projects, e.g. the poem extension http://www.mediawiki.org/wiki/Extension:Poem used by wikisource and its <poem>...</poem> tags; I guess that any sister project has something particular to be considered from the beginning of any work on a new editor. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] API vs data dumps
2010/10/13 Paul Houle p...@ontology2.com Don't be intimidated by working with the data dumps. If you've got an XML API that does streaming processing (I used .NET's XmlReader) and use the old unix trick of piping the output of bunzip2 into your program, it's really pretty easy. When I worked on the it.source dump (a small dump! something like 300 MBy unzipped), I used a simple do-it-yourself Python string search routine and I found it really faster than the Python XML routines. I presume that my scripts are really too rough to deserve sharing, but I encourage programmers to write a simple dump reader using the speed of string search. My personal trick was to build an index, i.e. a list of pointers to the articles and the article names in the XML file, so that it was simple and fast to recover their content. I used it mainly because I didn't understand the API at all. ;-) Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
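A minimal sketch of that kind of index, assuming an uncompressed pages-articles XML dump and plain string search only (no XML parser); the dump file name is just a placeholder:

    # build {title: byte offset of its <page> element} from an uncompressed XML dump (sketch)
    def dump_index(path="itwikisource-pages-articles.xml"):
        index = {}
        with open(path, "rb") as f:
            data = f.read()                 # fine for small dumps; stream for big ones
        pos = 0
        while True:
            pos = data.find(b"<page>", pos)
            if pos == -1:
                break
            t0 = data.find(b"<title>", pos) + len(b"<title>")
            t1 = data.find(b"</title>", t0)
            index[data[t0:t1].decode("utf-8")] = pos
            pos = t1
        return index

    idx = dump_index()
    print(len(idx), "pages indexed")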
Re: [Wikitech-l] Table and data parsing extension
2010/10/8 Dmitriy Sintsov ques...@rambler.ru * Wikirating Team t...@wikirating.org [Thu, 07 Oct 2010 22:10:32 +0200]: Hi there, This is my first post on the wikitech forum and I hope I'm posting it correctly... Hi! You may also take a look at [[Extension:Semantic MediaWiki]] and related Semantic* extensions. Dmitriy On it.wikisource a DIY solution for this is to use the Labeled Section Transclusion extension to build named variable arrays. It runs, but - as I said - it's merely a DIY trick; our goal is to implement [[Extension:Semantic MediaWiki]], the real solution. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] A namespace for pages without chronology
Special pages, if I understand all their features, are special because:
# they come from a live API query;
# they cannot be managed/created/edited by users;
# they have no chronology (it would be nonsense).
It.source uses many list pages, updated daily by a bot, containing other project-specific queries. They are normal pages, and their chronology is both useless and heavy. The DynamicPageList extension could solve in part such a useless overload of web space, but its output can't be finely tuned. So, I imagine that it could be useful to have a special namespace for customizable, user-defined query lists, with only one special feature: the lack of the chronology stuff. I can imagine that a possible candidate for such exotic, chronology-free pages could be the Special: namespace itself; obviously the names of user-created pages in the Special: namespace should be different from any canonical Special page. Am I mad? Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] A namespace for pages without chronology
2010/10/7 Aryeh Gregor simetrical+wikil...@gmail.com It.source uses many list pages, daily updated by a bot, containing other project-specific queries. They are normal pages, and their chronology is both useless and heavy. DynamicPageList extension could solve in part such a useless overload of web space, but its output can't be finely tuned. Storing many revisions in history is not expensive. Don't worry about it. There's no way any users, even admins, will ever be permitted to update pages without leaving any history -- it violates the principle of reversibility that underlies how wikis work. Is there some problem you have with these list pages other than the fact that they have lots of history that no one cares about? If not, this is useful to read: Thank you. Really, I went back to http://stats.wikimedia.org/wikisource/EN/TablesWikipediaIT.htm, and the list pages (Elenco...) have a history of less than 2 MBy each. You're right. Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] A namespace for pages without chronology
2010/10/7 Platonides platoni...@gmail.com Really I went back to http://stats.wikimedia.org/wikisource/EN/TablesWikipediaIT.htm, and list pages (Elenco...) have a history of less than 2Mby each. You're right. If you want to reduce the history size, you should begin by removing date-changing edits [1]. If you really need to show a date there, you can include a template from all the pages containing just the date when all of them were last updated. 1- Eg. no changes in the last 16 days, but the bot is dutifully copying the page each day http://it.wikisource.org/w/index.php?title=Wikisource%3AElenco_alfabetico_degli_autori_A&action=historysubmit&diff=662309&oldid=638167 Thanks! I haven't studied that code so far, nor am I the driver of that bot, but you're perfectly right! I'll edit it so that the page is saved only when the list has changed! Alex ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
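A tiny sketch of that fix, assuming a Pywikibot-style bot; the page title is the one from the example above and build_author_list() is a hypothetical stand-in for the bot's own list generator:

    # save the list page only if its content actually changed (sketch, Pywikibot API assumed)
    import pywikibot

    site = pywikibot.Site("it", "wikisource")
    page = pywikibot.Page(site, "Wikisource:Elenco alfabetico degli autori A")
    new_text = build_author_list()   # hypothetical: the bot's own list generator

    if page.text.strip() != new_text.strip():
        page.text = new_text
        page.save(summary="Bot: aggiornamento elenco autori")
    # otherwise do nothing, so no empty revision piles up in the history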