Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
On Fri, Jan 21, 2011 at 5:30 AM, Tim Starling tstarl...@wikimedia.org wrote: On 21/01/11 12:46, Trevor Parscal wrote: Joke or not, it's in there, and it's a violation of the GPL. Did you try emailing the author and asking for a dual license? I believe that people from Redhat have already tried that and failed. Bryan ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] File licensing information support
Roan Kattouw wrote: 2011/1/21 Platonides platoni...@gmail.com: Conceptually, revision table shouldn't link to file_props. file_props should be linked with image instead. Maybe, but the current image/oldimage schema resembling cur/old is horrible. For instance, there is no way to uniquely identify an oldimage row. I agree. It should also be fixed. We talked about this for an hour and decided that we have some ideas for restructuring that, but that it's a huge operation that shouldn't block the license integration project. Roan Kattouw (Catrope) If we wanted to map it to a page/revision format, it seems quite straightforward. I'm missing something, right?
Re: [Wikitech-l] File licensing information support
The interest of the Wikisource project in a formal and standardized set of book metadata (I presume from Dublin Core) in a database table is obvious. Some preliminary tests on it.source suggest that templates and the Labeled Section Transclusion extension could have a role as existing wikitext containers for semantized variables; the latter perhaps more interesting than the former, since their content can be accessed directly from any page. I'd like book metadata to be considered from the beginning of this interesting project. Alex
Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
On Thu, Jan 20, 2011 at 8:30 PM, Tim Starling tstarl...@wikimedia.org wrote: Sure, but Trevor is claiming that he wrote it because of the license issue. Since he has publicly ranted three times: http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/50082 http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/50910 http://www.mediawiki.org/wiki/Special:Code/MediaWiki/73196#c13027 about how terrible my changes to JSMin.php were, in September, December and January, I can't help but think that there was another motive for this rewrite. I have no problem with Trevor reverting that change, I said so the second time this came up, so I think it's amusing that Trevor needed the excuse of license incompatibility before he actually did something. I'm glad that he's finally found the motivation from somewhere, so that he can stop bothering me with his periodic rants. This is really unnecessary and unhelpful on a public mailing list. I think we'd all be better off if snark like this were kept to private channels. -- Andrew Garrett http://werdn.us/
Re: [Wikitech-l] From page history to sentence history
On Wed, Jan 19, 2011 at 4:15 PM, Anthony wikim...@inbox.org wrote: No, the question is why the relevant code is totally unrelated. Well, you might ask why we don't just (selectively) dump the page, revision, and text tables instead of doing XML dumps -- it seems like it would be much simpler -- but I have no idea. Perhaps it's to ease processing with non-MediaWiki tools, but I'm not sure why that's a design goal compared to the simplicity of SQL dumps. Surely it wouldn't be too hard to write a maintenance/ tool that just fetches the revision text for a particular article at a particular point, using only those three tables without any MediaWiki framework so it can be used standalone. Not to mention, the text table is immutable, so creating and publishing text table dumps incrementally should be trivial. But I'm not going to criticize anyone from the peanut gallery here. I don't actually know much about the dumps work. Happy-melon is correct to point out that it might not be trivial to snip private info (even oversighted revisions) from the text table, depending on how it's constructed. There might be other concerns too. And there are lots of lower-priority things that are being done. And lots of dollars sitting on the sidelines doing nothing. That's a discussion for foundation-l, not wikitech-l. On Thu, Jan 20, 2011 at 4:04 AM, Anthony wikim...@inbox.org wrote: It wouldn't be trivial, but it wouldn't be particularly hard either. Most of the work is already being done. It's just being done inefficiently. I'm glad to see you know what you're talking about here. Presumably you've examined the relevant code closely and determined exactly how you'd implement the necessary changes in order to evaluate the difficulty. Needless to say, patches are welcome.
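[Editorial illustration] The three-table lookup described above can be sketched as a single standalone query. This is a hedged sketch using a simplified subset of the 2011-era MediaWiki schema (page, revision, text; rev_text_id pointing at text.old_id), with sqlite3 standing in for MySQL; it is not taken from any MediaWiki maintenance script.

```python
import sqlite3

# Fetch the text of an article as it stood at a given timestamp, using only
# the page, revision and text tables (simplified column subset).
QUERY = """
SELECT t.old_text
FROM page p
JOIN revision r ON r.rev_page = p.page_id
JOIN text t ON t.old_id = r.rev_text_id
WHERE p.page_namespace = ?
  AND p.page_title = ?
  AND r.rev_timestamp <= ?
ORDER BY r.rev_timestamp DESC
LIMIT 1
"""

def text_as_of(conn, ns, title, timestamp):
    """Return the revision text current at `timestamp`, or None."""
    row = conn.execute(QUERY, (ns, title, timestamp)).fetchone()
    return row[0] if row else None
```

The same query would work against a SQL dump of just those three tables, without any MediaWiki framework.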
Re: [Wikitech-l] MATH markup question
On Wed, Jan 19, 2011 at 11:56 PM, Carl (CBM) cbm.wikipe...@gmail.com wrote: The ideal solution for Wikipedia would be to move to a system in which users with relatively modern browsers don't see images at all. There is already a candidate for that system: MathJax. This has extensive browser compatibility [1] and is actively maintained, with some big-name sponsors behind it [2]. The main difficulties enabling it on Wikipedia would be configuration and checking for any inconsistencies with texvc (so the main limitation is developer interest). When I load their homepage, the formulas don't appear for about two seconds of 100% CPU usage, on Firefox 4b9. And that's for two small formulas. I'm not impressed. IMO, the correct way forward is to work on native MathML support -- Gecko and WebKit both support it these days, and Opera somewhat does too. I'm sure the support is a bit spotty, but if Wikipedia used it (even as an off-by-default option) that would surely drive a lot of progress. These days (with the deployment of HTML5 parsers) it can be embedded directly into HTML, it's not limited to XML.
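[Editorial illustration] The point about HTML5 parsers is that a MathML fragment can now sit directly inside ordinary HTML, without serving the page as XHTML/XML. A minimal made-up fragment (not from any MediaWiki output), with a small well-formedness check:

```python
import xml.etree.ElementTree as ET

# A hand-written MathML fragment rendering x^2 + 1/n in MathML-capable
# browsers; with an HTML5 parser it can be dropped straight into a page.
FRAGMENT = (
    "<math>"
    "<msup><mi>x</mi><mn>2</mn></msup>"
    "<mo>+</mo>"
    "<mfrac><mn>1</mn><mi>n</mi></mfrac>"
    "</math>"
)

def is_well_formed(fragment: str) -> bool:
    """Sanity check: the fragment parses as well-formed markup."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False
```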
Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
On Fri, Jan 21, 2011 at 10:49 AM, Andrew Garrett agarr...@wikimedia.org wrote: This is really unnecessary and unhelpful on a public mailing list. I think we'd all be better off if snark like this were kept to private channels. Agreed. Or better yet, not said at all. Since we evidently no longer have a benevolent dictator whose word we all accept without question, we need to work amicably to resolve disputes, and getting into fights about 1% size savings vs. dubious increases in readability is really not useful. At least get into fights about something *significant*. (FWIW, I was and am against removing *all* newlines from JS output, but I'm fine with collapsing consecutive newlines. This way errors will actually point to a recognizable line.)
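[Editorial illustration] The compromise suggested above -- collapse runs of blank lines instead of stripping all newlines -- is a one-regex transform. A hypothetical sketch (not the actual JavaScriptDistiller code):

```python
import re

def collapse_newlines(js: str) -> str:
    """Collapse consecutive (possibly whitespace-only) blank lines to a
    single newline, so error line numbers still land near recognizable
    code, while single newlines and indentation are preserved."""
    return re.sub(r'\n\s*\n+', '\n', js)
```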
Re: [Wikitech-l] helping in WYSIWYG editor efforts
So a few minutes ago we had a conversation about this. Panos will set up a public collaboration space within GRNET. A few developers will be (part-time) working on this from February for a (so far) unspecified amount of time. The consensus was that it would be good to start off with some basic usability testing, to see how well the different tools work for novice users. It'll be very basic testing, with about 10 subjects from within GRNET (so with a bit of technical bias), but only those who haven't edited before. Both Magnus' and my tools will be implemented on a clone of the Greek Wikipedia and we will set up a fabricated article that works well with both of our editors. It's only about the usability, not about technical aspects for now. Both editing tools will have to be adapted and localised; perhaps this can even be done by GRNET developers. We'll use the usability script that I used before with the Sentence-Level Editing usability research. Once this usability testing has been done, we'll decide how to distribute the efforts, and what will be done. We'll work closely with the GRNET developers to assist them in working on these projects. Once we have more information it will be posted to this list. Cheers, Jan Paul On 19-Jan-2011, at 23:34, Magnus Manske wrote: I have added Panos to Skype; yes, we should probably exchange Skype handles off-list. I am in Cambridge (London time), so that should work. Cheers, Magnus On Wed, Jan 19, 2011 at 8:34 PM, Jan Paul Posma jp.po...@gmail.com wrote: Skype sounds great! Also, I heard you work with Ariel, which is great because that way you have a more local person to contact with MediaWiki questions. Perhaps we can get off-list with those interested to schedule an introductory meeting? (You, me, Magnus, Ariel, others?) I am located in the Netherlands, so our hours will be similar. Cheers, Jan Paul On 19-Jan-2011, at 19:47, Panos Louridas wrote: Thanks to both Jean Paul and Magnus for taking up the offer!
Based on your input I will look into our developer pool for people with expertise in the following:
* Advanced JS, preferably with experience in optimisation issues etc.
* UI design, usability testing, etc.
* Text processing (of sorts) for the needs of SLE
(if you believe I am missing something, say so) I expect to have the people in place in February; I will let you know. I will be following the list. Jean Paul indicated that we might talk in more detail. I do not follow IRC because of my tight schedule; I do use Skype, however (ID: louridas). Please Jean Paul, Magnus, and others, let me know if that suits you. As I am located in Athens, my waking hours are around East European Time. Cheers, Panos. On Jan 19, 2011, at 3:54 PM, Jan Paul Posma wrote: A very generous offer indeed! My own SLE and Magnus' WYSIFTW are indeed the most active projects, so that would be a good bet. Actually, for me the timing is just right, as I'll be working on a paper about this editor for a while, so it'd be cool to have someone(s) continue the project. If one of your researchers has a brilliant idea on how to do this right, that would obviously be really valuable too. A lot of things Magnus mentioned apply to my project too:
* Improving detection algorithms, i.e. better sentence-level editing (perhaps using an external language recognition library), better detection of other elements. Keep in mind that the editor excludes anything it doesn't 'understand', so this is a nice fallback; you don't have to write a complex parser that detects a lot of stuff at once.
* Cross-browser/platform/device compatibility (think mobile, touchscreens, etc.)
* Usability testing (the more the merrier!)
* Verifying detection coverage (which % of the wikitext is editable) and quality (wikitext -> adding markers -> MediaWiki parser -> removing markers -> wikitext?). Checking this on a large number of pages.
* Test suites (again, the more the merrier, but only for parts of the code and interface that are considered stable!)
* Lots of implementation details: embedding the (current) editor toolbar in the textboxes, making sure (a fair percentage of) gadgets still work with this, and handling unusual cases like edit conflicts, etc.
Perhaps it'd be good to have a (video or IRC?) conversation with you, your developers, people from the Foundation, and people from the specific projects you want to contribute to. Again, really awesome that you guys want to work on this! :-) Best regards, Jan Paul On 19-Jan-2011, at 9:55, Magnus Manske wrote: On Mon, Jan 17, 2011 at 1:47 PM, Panos Louridas louri...@grnet.gr wrote: Hi, At the Greek Research and Education Network (GRNET) we look at the possibility of contributing to the development of WYSIWYG editor support in Wikipedia. We understand that considerable work has already taken place in the area, e.g.: *
Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
On Fri, Jan 21, 2011 at 7:21 AM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: On Fri, Jan 21, 2011 at 10:49 AM, Andrew Garrett agarr...@wikimedia.org wrote: This is really unnecessary and unhelpful on a public mailing list. I think we'd all be better off if snark like this were kept to private channels. Agreed. Or better yet, not said at all. Since we evidently no longer have a benevolent dictator whose word we all accept without question, we need to work amicably to resolve disputes, and getting into fights about 1% size savings vs. dubious increases in readability is really not useful. At least get into fights about something *significant*. I agree. Talk about the code, not the committer. But I really don't think we need a benevolent dictator. Brion never really had to step in and play traffic cop; we've all gotten along and worked well over the years without generally needing one and I see no reason why that needs to change now. And yes, I think we should keep the arguments confined to major things. Ok everyone stop for a second. We're arguing about a few lines of vertical whitespace. Full stop. Let's argue about complex architecture changes or something else worthwhile. It's really not worth it to waste so many bytes (and my e-mail now just adds to the pile, I know) over something so incredibly trivial. Back to the subject at hand... While I happen to think the licensing issue is rather bogus and doesn't really affect us, I'm glad to see it resolved. It outperforms our current solution and keeps the same behavior. Plus as a bonus, the vertical line smushing is configurable so if we want to argue about \n a year from now, we can :) -Chad
Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
On 21/01/11 23:21, Aryeh Gregor wrote: On Fri, Jan 21, 2011 at 10:49 AM, Andrew Garrett agarr...@wikimedia.org wrote: This is really unnecessary and unhelpful on a public mailing list. I think we'd all be better off if snark like this were kept to private channels. Agreed. Or better yet, not said at all. Since we evidently no longer have a benevolent dictator whose word we all accept without question, we need to work amicably to resolve disputes, and getting into fights about 1% size savings vs. dubious increases in readability is really not useful. At least get into fights about something *significant*. Why does everyone think I care about the issue when I keep saying that I don't? My email was an expression of frustration at the fact that Trevor keeps attacking me about the change I made despite the fact that I made it abundantly clear last time around that I don't care about it and don't want to hear anything more about it. -- Tim Starling
Re: [Wikitech-l] From page history to sentence history
On Fri, Jan 21, 2011 at 6:48 AM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: Not to mention, the text table is immutable, so creating and publishing text table dumps incrementally should be trivial. The problem there is deletion and oversight. The best solution if you didn't have to worry about that would be to have a database on the dump servers with only public data, which accesses a live feed (over the LAN). Then creating a dump would be as simple as pg_dump, and fancier incremental dumps could be made relatively simply as well. Then again, if your live feed tells you which revisions to delete/oversight, that's still a viable solution. On Thu, Jan 20, 2011 at 4:04 AM, Anthony wikim...@inbox.org wrote: It wouldn't be trivial, but it wouldn't be particularly hard either. Most of the work is already being done. It's just being done inefficiently. I'm glad to see you know what you're talking about here. Presumably you've examined the relevant code closely and determined exactly how you'd implement the necessary changes in order to evaluate the difficulty. Needless to say, patches are welcome. Access to the servers is welcome. I can't possibly test and improve performance without it. Alternatively, give me a free live feed, and I'll make a decent dump system here at home, and provide the source code when I'm done.
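[Editorial illustration] Since the text table is append-only, the incremental-dump idea discussed above reduces to "everything past the last dumped row, minus suppressed rows". A hedged sketch, where `suppressed_ids` stands in for whatever feed identifies deleted/oversighted revisions:

```python
def incremental_text_dump(rows, last_dumped_id, suppressed_ids):
    """Return (old_id, text) pairs newer than last_dumped_id, skipping any
    row flagged for deletion/oversight. `rows` is an iterable of
    (old_id, text) tuples, e.g. streamed from the text table."""
    return [(old_id, text) for old_id, text in rows
            if old_id > last_dumped_id and old_id not in suppressed_ids]
```

The deletion feed is the hard part in practice, as the thread notes; this only shows why the append-only property makes the rest cheap.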
Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
On 01/21/2011 08:21 AM, Chad wrote: While I happen to think the licensing issue is rather bogus and doesn't really affect us, I'm glad to see it resolved. It outperforms our current solution and keeps the same behavior. Plus as a bonus, the vertical line smushing is configurable so if we want to argue about \n a year from now, we can :) Ideally we will be using Closure by then, and since it rewrites functions and variable names and sometimes collapses multi-line functionality, newline preservation will be a moot point. Furthermore, Google even has a nice add-on to Firebug [1] for source code mapping, making the dead horse even more dead. I feel like we are stuck back in time, arguing about optimising code that came out eons ago in net time (more than 7 years ago). There are more modern solutions that take these concerns into consideration and do a better job at it (i.e. not just a readable line but a pointer back to the line of source code that is of concern). [1] http://code.google.com/closure/compiler/docs/inspector.html peace, --michael
Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
On 22/01/11 02:49, Aaron Schulz wrote: This sounds like thinking out loud (not to say whether it's true or false). It seems like there just has to be some better, more private, means to discuss things like this... Fair enough. Apologies to the list. -- Tim Starling
Re: [Wikitech-l] File licensing information support
On 01/21/2011 02:45 AM, Alex Brollo wrote: The interest of the Wikisource project in a formal and standardized set of book metadata (I presume from Dublin Core) in a database table is obvious. Some preliminary tests on it.source suggest that templates and the Labeled Section Transclusion extension could have a role as existing wikitext containers for semantized variables; the latter perhaps more interesting than the former, since their content can be accessed directly from any page. I'd like book metadata to be considered from the beginning of this interesting project. Alex This quickly dovetails into the Semantic MediaWiki discussion... there are other threads on this list to reference. There is a wiki data summit / meeting coming up, where these issues will likely be discussed. Maybe we could start eliciting requirements and needs of projects like what you describe for Wikisource, and others that have been listed elsewhere, on a pre-meeting project page; this way we can be sure to hit on all these items during the meeting. --michael
[Wikitech-l] New committer: Apekshit Sharma (appy)
Hi everyone, Earlier this week, I added Apekshit Sharma (appy) as a committer in extensions-only for work on Article Highlight: http://www.mediawiki.org/wiki/Extension:Article_Highlight Welcome appy! Rob
Re: [Wikitech-l] File licensing information support
Hello, As you may have noticed, Roan, Krinkle and I have started to more tightly integrate image licensing within MediaWiki. Our aim is to create a system where it should be easy to obtain the basic copyright information of an image in a machine-readable format, as well as to query images with a certain copyright state (all images copyrighted by User:XY, all images licensed CC-BY-SA, etc.). At this moment we only intend to store author and license information, but nothing stops us from expanding this in the future. We have put some information in a not so structured way at mw.org [1]. There are some issues open on the talk page [2]. Input is of course welcome, both here or preferably at the talk page. Bryan [1] http://www.mediawiki.org/wiki/Files_and_licenses_concept [2] http://www.mediawiki.org/wiki/Talk:Files_and_licenses_concept Has there been consideration given to translating author names into different languages? Relative to other types of metadata, having the author in different languages is not as important, since most people just use whatever the name in the author's native language is (or at least, that is what experience suggests to me). However, we might want to have different translations of authors' names in some circumstances:
* If the author is 'Unknown' or 'Anonymous', we'd definitely want to be able to translate that.
* If the author is a company, government or a group with a proper name, people tend to translate the name.
* If the author's native language is in a different script than the current language, then the author's name is usually translated in my experience. (Since, to the average English viewer, names in a language like Arabic or Tamil that doesn't use the Latin alphabet generally look like any other name in that language, I imagine people who only speak Arabic would have trouble differentiating between the written forms of different English names.)
(Of course, the above is just a guess as to why you'd want to translate author names; I don't know what happens in actual practice.) So I do think allowing such author properties to have multiple translations is something to consider. If there were support for translations of the values of these properties, ideally when querying this information from the API we'd want to be able to do things like: get the author's name in language X, falling back to the original language if unavailable; get the author's name in all available languages; etc. -bawolff
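[Editorial illustration] The fallback behaviour requested above -- requested language, then the original language, then anything available -- is easy to pin down in a few lines. A hypothetical sketch; the data shape and names are made up, not the proposed schema:

```python
def author_name(translations, lang, original_lang):
    """Look up an author name by language code with fallback.
    `translations` maps language codes to name strings."""
    for code in (lang, original_lang):
        if code in translations:
            return translations[code]
    # Last resort: any available translation, or None if there are none.
    return next(iter(translations.values()), None)
```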
Re: [Wikitech-l] File licensing information support
2011/1/21 Platonides platoni...@gmail.com: If we wanted to map it to a page/revision format, it seems quite straightforward. I'm missing something, right? You're missing that migrating a live site (esp. Commons, with 8 million image rows and ~750k oldimage rows) from the old to the new schema would be a nightmare, and would probably involve setting stuff to read-only for a few hours. Roan Kattouw (Catrope)
Re: [Wikitech-l] File licensing information support
On Fri, Jan 21, 2011 at 10:43 AM, Roan Kattouw roan.katt...@gmail.com wrote: 2011/1/21 Platonides platoni...@gmail.com: If we wanted to map it to a page/revision format, it seems quite straightforward. I'm missing something, right? You're missing that migrating a live site (esp. Commons, with 8 million image rows and ~750k oldimage rows) from the old to the new schema would be a nightmare, and would probably involve setting stuff to read-only for a few hours. If one's clever about it, this could probably actually be done on-the-fly in a reasonably non-evil fashion. Image version data isn't used as widely as revisions; e.g. things like Special:Contributions always needed direct access to old revs looked up by author, whereas I think image old versions are pretty much only pulled up by title, via the image record. There are also relatively few revisions per file -- old images usually only have a few revisions, and cases of thousands of versions are I suspect very rare -- which would make the actual conversion work relatively lightweight for each file record. Further optimizing by delaying on-demand migration of a record until write time could also keep it from being a sudden database I/O sink. If indirect lookups won't be needed, we can just keep reading the existing image/oldimage records until they need to be updated on modification (or get updated by a background task at leisure). -- brion
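[Editorial illustration] The migrate-on-write idea above can be sketched as a tiny state machine: reads keep serving whichever schema holds the file, and a file's history is converted to the new schema only the next time it is modified. All names here are illustrative, not MediaWiki code:

```python
class FileStore:
    """Sketch of lazy old-schema -> new-schema migration at write time."""

    def __init__(self, old_rows, new_rows):
        self.old_rows = old_rows  # title -> list of legacy version records
        self.new_rows = new_rows  # title -> list of new-schema revisions

    def read(self, title):
        # Reads never trigger migration; serve whichever schema has the file.
        if title in self.new_rows:
            return self.new_rows[title]
        return self.old_rows.get(title, [])

    def write(self, title, version):
        # Migrate this one file's history lazily, exactly once, then append.
        if title not in self.new_rows:
            self.new_rows[title] = self.old_rows.pop(title, [])
        self.new_rows[title].append(version)
```

A background task could call the same migration step at leisure, which is the "sudden I/O sink" avoidance Brion describes.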
Re: [Wikitech-l] helping in WYSIWYG editor efforts
On Fri, Jan 21, 2011 at 6:07 AM, Jan Paul Posma jp.po...@gmail.com wrote: So a few minutes ago we've had a conversation about this. Panos will set up a public collaboration space within GRNET. A few developers will be (part-time) working on this from February for a (so far) unspecified amount of time. The consensus was that it would be good to start off with some basic usability testing, to see how well the different tools work for novice users. It'll be very basic testing, with about 10 subjects from within GRNET (so with a bit of technical bias) but only those who haven't edited before. Both Magnus' and my tools will be implemented on a clone of the Greek Wikipedia and we will set up a fabricated article that works well with both of our editors. It's only about the usability, not about technical aspects for now. Both editing tools will have to be adapted and localised, perhaps this can even be done by GRNET developers. We'll use my usability script that I used before with the Sentence-Level Editing usability research. Once this usability testing has been done, we'll decide how to distribute the efforts, and what will be done. We'll work closely with the GRNET developers to assist them in working on these projects. Once we'll have more information it will be posted to this list. I know I tend to be down on markup-based editors lately but I'm **super-excited** that there's a bunch of folks actively working on trying these things out! Participatory open source at its finest. :) Definitely interested to see the usability testing... -- brion
Re: [Wikitech-l] File licensing information support
Roan Kattouw wrote: 2011/1/21 Platonides platoni...@gmail.com: If we wanted to map it to a page/revision format, it seems quite straightforward. I'm missing something, right? You're missing that migrating a live site (esp. Commons, with 8 million image rows and ~750k oldimage rows) from the old to the new schema would be a nightmare, and would probably involve setting stuff to read-only for a few hours. Roan Kattouw (Catrope) Do we agree on the target db schema? That's the important point. Migrating a large site like Commons is 'just' an operations issue. Making it read-only for a bit wouldn't be a big issue, but we could also, for instance, move to an intermediate point where uploads are stored in both formats, read-only in the old one, while a script moves records over. Finally, flip the switch and drop the old tables.
[Wikitech-l] sites using our bandwidth
Hello, The squid statistics report shows us that some sites are leeching our bandwidth. How to tell? They have a huge number of image referrals and barely any for pages. One example: in December, channelsurfing.net has been seen as a referrer for:
- roughly 1,000 pages
- 1,740,000 images
whatchnewfilms.com is at 14,000 / 581,000. By looking at their pages, they use upload.wikimedia.org and glue some advertisement around it. Given the cost in bandwidth, hard drives, CPU, architecture ... I do think we should find a solution to block those sites as much as possible. Would it be possible at the squid level? http://stats.wikimedia.org/archive/squid_reports/2010-12/SquidReportOrigins.htm -- Ashar Voultoiz
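[Editorial illustration] The "how to tell" heuristic above -- many image referrals, barely any page referrals -- can be expressed directly. A hedged sketch over referrer statistics; the thresholds are invented for illustration, not taken from the squid reports:

```python
def flag_leechers(referrer_stats, min_images=100_000, max_page_ratio=0.05):
    """Flag referrer domains with a large image-hit count but almost no
    page hits. `referrer_stats` maps domain -> (page_hits, image_hits)."""
    flagged = []
    for domain, (pages, images) in referrer_stats.items():
        if images >= min_images and pages < images * max_page_ratio:
            flagged.append(domain)
    return flagged
```

On the December numbers quoted above, channelsurfing.net (about 1,000 pages vs 1,740,000 images) trips this easily.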
Re: [Wikitech-l] File licensing information support
2011/1/21 Platonides platoni...@gmail.com: Do we agree on the target db schema? That's the important point. We haven't thought about it in detail. But it would be a fairly large change and require changes throughout the software, as well as possibly elsewhere in the schema. Migrating a large site like Commons is 'just' an operations issue. Making it read-only for a bit wouldn't be a big issue, but we could also, for instance, move to an intermediate point where uploads are stored in both formats, read-only in the old one, while a script moves records over. Finally, flip the switch and drop the old tables. Sure, it can be dealt with. It's just that it'd be an epic upgrade :) Roan Kattouw (Catrope)
Re: [Wikitech-l] sites using our bandwidth
On 21 January 2011 22:49, Ashar Voultoiz hashar+...@free.fr wrote: Given the cost in bandwidth, hard drives, CPU, architecture ... I do think we should find a solution to block those sites as much as possible. Would it be possible at the squid level? Given we actively endorse hotlinking (we merely caution against doing it for thumbnails) - http://commons.wikimedia.org/wiki/Commons:Reusing_content_outside_Wikimedia#Hotlinking - is there a problem to solve here? - d.
Re: [Wikitech-l] sites using our bandwidth
Ashar Voultoiz wrote: The squid statistics report shows us that some sites are leeching our bandwidth. How to tell? They have a huge number of image referrals and barely any for pages. One example: In December, channelsurfing.net has been seen as a referrer for: - roughly 1,000 pages - 1,740,000 images whatchnewfilms.com is at 14,000 / 581,000. By looking at their pages, they use upload.wikimedia.org and glue some advertisement around it. Given the cost in bandwidth, hard drives, CPU, architecture ... I do think we should find a solution to block those sites as much as possible. Would it be possible at the squid level? You're talking about hotlinking, right? Looking at the page source of channelsurfing.net, they're clearly hotlinking quite a bit. But as David notes, we generally encourage our content to be spread and used. Tim did some investigation into the issue of hotlinking in July 2008. His statistics and some of his findings are here: http://meta.wikimedia.org/w/index.php?oldid=1104187#Statistics To quote Tim directly: [quote] I'll save my comments on the bulk of the proposal for later, but I'll say this now: it's certainly not worth my time (or that of any other system administrator) to deal with these sites on a case-by-case basis. Bandwidth may be valuable, but staff time is also valuable. [/quote] In his view, the costs outweighed any benefit to looking at hotlinking on a case-by-case basis, particularly when you factor in CPU time to process regexes at the Squid level and sysadmin time to monitor and update these records. That having been said, it may make sense to make specific exceptions for statistical outliers in the logs. Of course you can read his comments directly at the linked page and make your cost/benefit analysis. :-) MZMcBride
Re: [Wikitech-l] sites using our bandwidth
On Fri, Jan 21, 2011 at 5:02 PM, MZMcBride z...@mzmcbride.com wrote: You're talking about hotlinking, right? Looking at the page source of channelsurfing.net, they're clearly hotlinking quite a bit. But as David notes, we generally encourage our content to be spread and used. I particularly enjoyed the part that their header is hotlinked and hosted on Commons. http://commons.wikimedia.org/wiki/File:Cslogo.gif Seems like that's definitely using Commons as their image host.
Re: [Wikitech-l] sites using our bandwidth
On 21 January 2011 23:31, George Herbert george.herb...@gmail.com wrote: If they're linking to images we legitimately host, which meet our image guidelines, are used in WP or other WMF projects, etc., then... shrug. I didn't realize we were OK with hotlinking like that, but if that's the published policy, that's the published policy.

It's the published guideline. Presumably a techie could change it at any time if it were no longer true.

- d.
Re: [Wikitech-l] sites using our bandwidth
On Fri, Jan 21, 2011 at 6:32 PM, David Gerard dger...@gmail.com wrote: On 21 January 2011 23:31, George Herbert george.herb...@gmail.com wrote: If they're linking to images we legitimately host, which meet our image guidelines, are used in WP or other WMF projects, etc., then... shrug. I didn't realize we were OK with hotlinking like that, but if that's the published policy, that's the published policy. It's the published guideline. Presumably a techie could change it at any time if it were no longer true.

The default InstantCommons configuration (and the suggested manual configuration) includes caching, so sites using it should (mostly) not be hotlinking. However, you can disable the caching, in which case you'll be hotlinking the thumbs.

-Chad
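For readers following along, the caching behaviour Chad describes lives in LocalSettings.php. A rough sketch is below; the repo options shown are the ForeignAPIRepo parameters as I understand them, and should be checked against the InstantCommons documentation before use.

```
<?php
// Simplest form: enable InstantCommons with its default caching.
$wgUseInstantCommons = true;

// Roughly equivalent manual configuration. With apiThumbCacheExpiry > 0,
// thumbnails are fetched once and cached locally; setting it to 0 skips
// the local cache, so every thumbnail view hotlinks upload.wikimedia.org.
$wgForeignFileRepos[] = array(
    'class'               => 'ForeignAPIRepo',
    'name'                => 'commonswiki',
    'apibase'             => 'http://commons.wikimedia.org/w/api.php',
    'fetchDescription'    => true,
    'apiThumbCacheExpiry' => 86400, // seconds; 0 = no local thumb cache
);
```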
[Wikitech-l] Announcing OpenStackManager extension
For the past month or so I've been working on an extension to manage OpenStack (Nova), for use on the Wikimedia Foundation's upcoming virtualization cluster: http://ryandlane.com/blog/2011/01/02/building-a-test-and-development-infrastructure-using-openstack/

I've gotten to a point where I believe the extension is ready for an initial release. In brief, OpenStack works a lot like EC2, and in fact implements the EC2 API. This extension interacts with the EC2 API and LDAP to manage a virtual machine infrastructure. It has the following features:

* Integrates with the LdapAuthentication extension, and creates user accounts in LDAP upon user creation
** Users are created with a posix username, uid, and gid; home directory; OpenStack credentials; and wiki credentials
* Manages most features of OpenStack
** Handles project creation/deletion and membership
** Handles project and global role membership
** Handles instance creation/deletion
** Handles security group creation/deletion and rule addition/removal
** Handles floating IP address allocation and association with instances
** Handles public SSH key addition/removal from user accounts
* Manages DNS via PowerDNS with an LDAP backend
** Handles private DNS for private IP address ranges upon instance creation and deletion
** Handles public DNS for floating IP addresses
* Manages Puppet configuration for instances via Puppet with an LDAP backend for nodes

The extension was written to handle the case explained in my blog post. It is likely not written in a generic enough way to fit into most existing infrastructures currently. If you'd like to help make the extension more useful for a wider audience, please contact me, send patches, or, if you have commit access, make modifications. I have a test/dev environment for this project configured on tesla, if you'd like to work in a pre-configured environment.
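To make the LDAP side of the feature list concrete: a posix account of the kind described might look something like the LDIF below. All names, numbers, and the DIT layout are invented for illustration; the extension's actual schema and attribute set may differ (e.g. SSH keys would typically use the sshPublicKey attribute from the OpenSSH-LPK schema).

```
# Hypothetical example entry; not the extension's actual schema.
dn: uid=jdoe,ou=people,dc=example,dc=org
objectClass: inetOrgPerson
objectClass: posixAccount
uid: jdoe
cn: John Doe
uidNumber: 10001
gidNumber: 10001
homeDirectory: /home/jdoe
loginShell: /bin/bash
```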
The extension page is here: http://www.mediawiki.org/wiki/Extension:OpenStackManager

Respectfully,

Ryan Lane
Re: [Wikitech-l] File licensing information support
Roan Kattouw wrote: 2011/1/21 Platonides platoni...@gmail.com: Do we agree on the target db schema? That's the important point.

We haven't thought about it in detail. But it would be a fairly large change, requiring changes throughout the software, and possibly elsewhere in the schema as well.

Migrating a large site like Commons is 'just' an operations issue. Making it read-only for a bit wouldn't be a big problem, but we could also, for instance, move to an intermediate point where uploads are stored in both formats, read-only in the old one, while a script migrates the records. Finally, flip the switch and drop the old tables.

Sure, it can be dealt with. It's just that it'd be an epic upgrade :)

Roan Kattouw (Catrope)

We already have 1.17 branched, so... who dares to create a branch and begin with it? :)
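To make the page/revision analogy from earlier in the thread concrete, a restructured schema might look roughly like this. Table and column names here are invented for illustration, not a worked-out proposal:

```sql
-- One row per file title, analogous to page:
CREATE TABLE file (
  file_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  file_name VARCHAR(255) BINARY NOT NULL,
  file_latest INT UNSIGNED NOT NULL, -- points at the current filerevision
  UNIQUE KEY fi_name (file_name)
);

-- One row per upload, analogous to revision. Old and current versions
-- live in one table, so every version has a unique id -- which is
-- exactly what oldimage lacks today:
CREATE TABLE filerevision (
  fr_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  fr_file INT UNSIGNED NOT NULL, -- FK to file.file_id
  fr_timestamp BINARY(14) NOT NULL,
  fr_user INT UNSIGNED NOT NULL,
  fr_size INT UNSIGNED NOT NULL,
  fr_sha1 VARBINARY(32) NOT NULL,
  KEY fr_file_timestamp (fr_file, fr_timestamp)
);
```

A revision-style layout like this would also give license/metadata tables a stable per-version row to point at, which is the motivation raised in the earlier messages.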
Re: [Wikitech-l] Announcing OpenStackManager extension
I just wanted to add my $0.02 here... Ryan demo'ed this at the West Coast Wiki Conference under the heading The site infrastructure that *you* can edit. He presented it as a way to bring volunteers back into ops, by giving them the power to create and test complex configurations without affecting live deployments. I'm not an ops person, but this looked very cool to me.

On 1/21/11 3:55 PM, Ryan Lane wrote: For the past month or so I've been working on an extension to manage OpenStack (Nova), for use on the Wikimedia Foundation's upcoming virtualization cluster: http://ryandlane.com/blog/2011/01/02/building-a-test-and-development-infrastructure-using-openstack/ [rest of announcement snipped]

-- Neil Kandalgaonkar (| ne...@wikimedia.org