[Wikitech-l] forking media files
Let me retitle one of the topics nobody seems to touch.

On Fri, Aug 12, 2011 at 13:44, Brion Vibber br...@pobox.com wrote:
> * media files -- these are freely copiable but I'm not sure about the state of easily obtaining them in bulk. As the data set moved into the terabyte range it became impractical to just build .tar dumps. There are batch downloader tools available, and the metadata's all in the dumps and the API.

Right now it is basically locked down: there is no way to bulk copy the media files, not even to make a simple backup of one Wikipedia, or of Commons. I've tried, I've asked, and the answer was basically "contact a dev and arrange it", which obviously could be done (I know many of the folks), but that isn't the point.

Some explanations were offered, mostly that the media and its metadata are quite detached, which makes it hard to enforce licensing quirks like attribution, special licenses and such. I can see this is a relevant point, since the text corpus is uniformly licensed under CC/GFDL while the media files are at best non-homogeneous (like Commons, where everything is free in one way or another) and complete chaos at worst (individual Wikipedias, where there may be anything from leftover fair use to content copyrighted by various entities to images about to be deleted).

Still, I do not believe making it close to impossible to bulk copy the data is a good answer. I am not sure which technical approach is best, as there are several competing ones. We could, for example, open up an API which serves a media file together with its metadata, possibly supporting mass operations; still, that is pretty inefficient. Or we could support zsync, rsync and the like (and I again recommend examining zsync's several interesting abilities to offload the work to the client), but there ought to be some pointer to the image metadata, at least a one-liner file for every image linking to its license page.
Or we could tie bulk access to established editor accounts, so we would have at least some assurance that the downloader knows what s/he's doing.

-- byte-byte, grin

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
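An API serving a media file together with its metadata, as suggested above, half-exists today: the standard MediaWiki API can at least enumerate files with their URLs and license information in one query. A minimal sketch against the public Commons endpoint (the `extmetadata` property and the `LicenseShortName` field are assumptions based on the current API; the point is keeping every file paired with its license, not throughput):

```python
import json
import urllib.parse
import urllib.request

API = "https://commons.wikimedia.org/w/api.php"  # assumed public endpoint

def allimages_params(limit=50, cont=None):
    """Build query parameters for one page of the file listing,
    asking for each file's URL and license metadata together."""
    params = {
        "action": "query",
        "list": "allimages",
        "aiprop": "url|sha1|extmetadata",
        "ailimit": str(limit),
        "format": "json",
    }
    if cont:
        # Continuation token from the previous response, if any.
        params["aicontinue"] = cont
    return params

def iter_files(limit=50):
    """Yield (name, url, license-name) triples, following continuation."""
    cont = None
    while True:
        qs = urllib.parse.urlencode(allimages_params(limit, cont))
        with urllib.request.urlopen(f"{API}?{qs}") as resp:
            data = json.load(resp)
        for img in data["query"]["allimages"]:
            meta = img.get("extmetadata", {})
            lic = meta.get("LicenseShortName", {}).get("value", "unknown")
            yield img["name"], img["url"], lic
        cont = data.get("continue", {}).get("aicontinue")
        if cont is None:
            break
```

A real bulk copier would still need rate limiting and resumability on top of this, but the listing alone keeps every downloaded file tied to its license page, which addresses the attribution concern.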
Re: [Wikitech-l] forking media files
On Mon, Aug 15, 2011 at 18:40, Russell N. Nelson - rnnelson rnnel...@clarkson.edu wrote:
> The problem is that 1) the files are bulky,

That's expected. :-)

> 2) there are many of them, 3) they are in constant flux,

That is not really a problem: since there are so many of them, statistically most of them are not in flux.

> and 4) it's likely that your connection would close for whatever reason part-way through the download.

I believe I did not forget to mention zsync/rsync. ;-)

> Even taking a snapshot of the filenames is dicey. By the time you finish, it's likely that there will be new ones, and possible that some will be deleted. Probably the best way to make this work is to 1) make a snapshot of files periodically,

Since I've been told they're backed up, such a snapshot naturally should already exist.

> 2) create an API which returns a tarball using the snapshot of files that also implements Range requests. Of course, this would result in a 12-terabyte file on the recipient's host. That wouldn't work very well. I'm pretty sure that the recipient would need an http client which would 1) keep track of the place in the bytestream and 2) split out files and write them to disk as separate files. It's possible that a program like getbot already implements this.

I would much prefer a ready-to-use format over a tarball, not to mention that it's pretty resource-consuming to create a tarball just for this. I'd serve the snapshot without tar, especially because partial transfers aren't possible with a tarball.

-- byte-byte, grin
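The Range-request idea doesn't actually need a tarball: plain HTTP `Range` headers already let a client resume individual files. A hedged sketch (the snapshot URL is hypothetical; a 206 status means the server honoured the range, a 200 means it ignored it and restarted from the beginning):

```python
import os
import urllib.request

def byte_ranges(total_size, chunk_size):
    """Split a file of total_size bytes into inclusive (start, end)
    pairs suitable for HTTP 'Range: bytes=start-end' headers."""
    return [(start, min(start + chunk_size, total_size) - 1)
            for start in range(0, total_size, chunk_size)]

def resume_download(url, dest):
    """Continue a partial download from wherever the local file ends."""
    have = os.path.getsize(dest) if os.path.exists(dest) else 0
    req = urllib.request.Request(url, headers={"Range": f"bytes={have}-"})
    with urllib.request.urlopen(req) as resp, open(dest, "ab") as out:
        if resp.status == 200:
            # Server ignored the Range header; start over.
            out.truncate(0)
        while chunk := resp.read(1 << 16):
            out.write(chunk)
```

This is exactly the property a tarball loses: with individual files on plain HTTP (or better, rsync/zsync), an interrupted 12 TB transfer costs you at most one file, not the whole archive.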
Re: [Wikitech-l] XKCD: Extended Mind
On Thu, May 26, 2011 at 17:38, Leo Koppelkamm diebu...@gmail.com wrote:
> http://ryanelmquist.com/cgi-bin/xkcdwiki

A nice way to see that first sentences eventually lead to some general quantity or property, which links to [[property (philosophy)]], which links to Philosophy itself. So far I haven't seen a chain that didn't go through 'property'.

g
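The chain itself is easy to reproduce mechanically; the hard part is deciding what counts as the "first link" (real crawls skip parenthesised links, italics and non-article namespaces). A rough sketch of just the extraction step, assuming rendered HTML with `/wiki/` hrefs and using the colon to filter out other namespaces (File:, Help:, ...):

```python
import re

# Article links look like href="/wiki/Title"; a colon in the title
# marks another namespace (File:, Help:, ...), a '#' a section anchor.
WIKI_LINK = re.compile(r'href="/wiki/([^":#]+)"')

def first_article_link(html):
    """Return the first article title linked from rendered page HTML,
    or None if there is none."""
    m = WIKI_LINK.search(html)
    return m.group(1) if m else None
```

A full "get to Philosophy" walker would fetch each page, apply this, and stop on a repeat; the parenthetical-link rule would still need to be handled on top.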
Re: [Wikitech-l] XKCD: Extended Mind
On Wed, May 25, 2011 at 09:24, Tim Starling tstarl...@wikimedia.org wrote:
> On 25/05/11 17:05, Domas Mituzas wrote:
>> On May 25, 2011, at 9:35 AM, K. Peachey wrote:
>>> http://xkcd.org/903/ -Peachey
>> that error is fake! 10.0.0.242 is internal services DNS server and is not used to serve en.wikipedia.org - dberror log does not have a single instance of it! 10.0.6.42 on the other hand
> I would have thought the fact that it was hand drawn would have given it away.

But in this particular case hand-drawn doesn't mean the facts are allowed to slip: these drawings are usually extremely precise. (You can see which pulldowns he usually keeps open. :-)) I second Domas's checking, because there may be a super secret conspiracy and the drawing may be correct. ;-)

-- byte-byte, grin
Re: [Wikitech-l] XKCD: Extended Mind
On Wed, May 25, 2011 at 17:16, Domas Mituzas midom.li...@gmail.com wrote:

Thanks for clearing that up. Nice work.

g
Re: [Wikitech-l] What is wrong with Wikia's WYSIWYG?
On Mon, May 2, 2011 at 10:02, Tim Starling tstarl...@wikimedia.org wrote:
> I don't think using wikitext is the best way to make things easier for new users.

It's always been a dilemma for me how many computer-illiterate users we should wish for (or can actually tolerate). I don't feel that web whatever-point-whatever (whichever version we call it now) is fast, reliable or useful enough to be the main way to input encyclopedia text. These editors are usually very slow and quite unreliable (including the Google Docs stuff, which I believe is the most advanced tech out there in this area).

And... people habitually get completely lost in DTP software (be it [Open/whatever]Office or anything else); they can't comprehend formatting, fonts, text annotation and other advanced features. I do not see that WYSIWYG would make them better able to use these techniques. There are some folks who actually learned enough markup to completely screw up Wikibooks (putting flashing 80pt fonts in scrolling frames, with all kinds of CSS features that are not horrible in themselves), and I just fear what they could do with WYSIWYG. Such texts are sometimes easier to reformat from scratch (reset ALL formatting to default and start over).

The Foundation's purpose is to make editing easier for everyone and to invite and involve everyone, I know. I just have my doubts and worries, which I have now shared.

Peter
Re: [Wikitech-l] Version control
On Sun, Feb 7, 2010 at 00:38, Ævar Arnfjörð Bjarmason ava...@gmail.com wrote:
> It's interesting that the #1 con against Git in that document is "Lots of annoying Git/Linux fanboys".

No, it's the "screaming 'hell yeah!' while having no idea what they're talking about" part. :-)

g
Re: [Wikitech-l] Google phases out support for IE6
On Tue, Feb 2, 2010 at 00:44, Gregory Maxwell gmaxw...@gmail.com wrote:
> People are really bad at complaining, especially web users. We've had prolonged obvious glitches which must have affected hundreds of thousands of people, and maybe we get a couple of reports.

For Average Joe and Jane it usually isn't obvious what to do when something's broken. I've observed people using really broken websites (fallen-apart layout, broken menus) who never reported it, only complained to their colleagues. I second that people are bad at reporting problems, and I must add that computer people are usually bad at receiving the complaints and fixing them. ;-) I guess if you have a problem and you know someone who can do something about it, then it'll get fixed; otherwise it _may_ get fixed, one day or another. [I've experienced this latter problem with email config bugs and [not] having them fixed.]

Nevertheless I wouldn't miss any IE features, but then again I'm an anti-M$ fascist by genetics. ;-)

g
Re: [Wikitech-l] Google phases out support for IE6
What about creating a "monobo-oldies" theme for them? I mean, move the current stuff over to oldies, and drop support for the elder browsers from monobook.

-- byte-byte, grin
Re: [Wikitech-l] Wikitext vs. WYSIWYG (was: Proposal for editing template calls within pages)
On Thu, Sep 24, 2009 at 14:36, David Gerard dger...@gmail.com wrote:
> However, impenetrable wikitext is one of *the* greatest barriers to new users on Wikimedia projects. And this impenetrability is not, in any way whatsoever or by any twist of logic, a feature.

Adding a GUI layer on top of wikitext is always okay, as long as it's possible to get rid of it: the majority of edits do not come from new users, and losing flexibility for power users in order to gain more newbies doesn't sound like a good deal to me. At least all of the GUIs I've seen were slow and hard to use, and produced unwanted (side) effects whenever something even slightly complex was entered. And this isn't just Wikipedia's problem: Google Docs, which I guess is one of the most advanced web-based GUI systems, has plenty of usability problems which can only be fixed by messing with the source. And many core people want to mess with the source.

So, adding a newbie layer is okay as long as you don't mess up the work of the non-newbies.

g
Re: [Wikitech-l] [WikiEN-l] MediaWiki is getting a new programming language
On Fri, Jul 17, 2009 at 12:06, Gerard Meijssen gerard.meijs...@gmail.com wrote:
> Well, this strength is not that great when people like myself, who have commit rights on SVN, do not want to touch templates with a barge pole if we can help it. Wikipedia is supposed to be this thing everybody can edit.

I think you misprioritise the whole thing. Consider it a feature, not base functionality. Most installations do not use 10% of the available features, for lack of knowledge, time, bravery or whatever else. Writing templates with code is a rare art by my observation; most of the larger MW installations have never used it in the first place. Would you remove TeX (math) input because the language is complex? And since this would be an extension, I guess, does that imply you want to forbid(?) creating complex extensions? That's unrealistic. Geeks need this functionality; ungeeks may or may not care about it, and it doesn't really matter. If it's easier to understand than not, that's a plus.

By the way, I wouldn't touch PHP with a ten-foot pole and rubber gloves, but I'm fine with the current template syntax. We're individuals with our own preferences. ;-)

grin
Re: [Wikitech-l] [WikiEN-l] MediaWiki is getting a new programming language
On Wed, Jul 8, 2009 at 10:16, Gerard Meijssen gerard.meijs...@gmail.com wrote:
> The argument that a language should be readable and easy to learn is REALLY relevant and powerful. A language that is only good for geeks is detrimental to the use of MediaWiki. Our current templates and template syntax are horrible. Wikipedia is as a consequence hardly editable by everyone.

Mortals _use_ the templates; they don't _create_ them. Geeks create templates for mortals. The current syntax is indeed horrible, but complete readability is not the main issue, I'd say. Security, speed and flexibility should be, along with ease of implementation.

Peter
Re: [Wikitech-l] wikimedia, wikipedia and ipv6
On Fri, Jun 12, 2009 at 13:55, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> This might be useful, although most of the info is probably outdated: http://wikitech.wikimedia.org/view/Special:Search?search=ipv6&go=Go

Yep, including the dead labs link. But it mentions LVS [ipvs]; I don't know whether we use it or not, but it does support IPv6 as well. ;-)

g
[Wikitech-l] Unbreaking statistics
Hello,

I see I've created quite a stir, but so far nothing really useful has popped up. :-( Except this one from Neil:

> Yes, modifying the http://stats.grok.se/ systems looks like the way to go.

To me it doesn't really seem so, since it appears to use an extremely dumbed-down input which contains only page views and [unreliable] byte counters. Most probably it would require large rewrites, and a magical new data source.

> What do people actually want to see from the traffic data? Do they want referrers, anonymized user trails, or what?

Are you old enough to remember stats.wikipedia.org? As far as I remember it originally ran webalizer, then something else, then nothing. If you check a webalizer stat you'll see what's in it. We are using (or were, until our nice fellow editors broke it) awstats, which provides basically the same with more caching. The most used and useful stats are page views (daily and hourly breakdowns are pretty useful too), referrers, visitor domain and provider stats, OS and browser stats, screen resolution stats, bot activity stats, and visit duration and depth, among probably others.

At a brief glance I could replicate the grok.se stats easily, since they seem to be built from http://dammit.lt/wikistats/, but that feed is completely useless for anything beyond page hit counts. Is there a possibility to write code which processes the raw Squid data? Who do I have to bribe? :-/

-- byte-byte, grin
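For what the dammit.lt feed does offer, turning it into per-page view counts is only a few lines. A sketch assuming the published line format `project urlencoded_title views bytes` (space-separated, one page per line); the sample titles below are illustrative, not taken from a real dump:

```python
from urllib.parse import unquote

def parse_pagecounts(lines, project="en"):
    """Aggregate wikistats pagecount lines into {title: views} for one
    project.  Expected line shape: '<project> <title> <views> <bytes>';
    malformed lines and other projects are skipped."""
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 4 or parts[0] != project:
            continue
        _proj, title, views, _nbytes = parts
        name = unquote(title)  # titles are URL-encoded in the feed
        counts[name] = counts.get(name, 0) + int(views)
    return counts
```

This is exactly the ceiling of that data source: hit counts per title. Referrers, browser stats, visit depth and the rest would need the raw Squid logs, which is the point of the question above.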