Re: [Wikitech-l] [Multimedia] Audio/video updates: TimedMediaHandler, ogv.js, and mobile
Very impressive, amazing progress for a part-time project! That's interesting that iOS supports M-JPEG, I had not heard that before. Per M-JPEG in the Wikipedia app ... they have the WKWebView web view in iOS 8 and above, no? So in theory we could run the JS engine against the same subset of iOS devices at similar performance as a stopgap there. But of course the native player would be ideal ;) —michael

On Jun 11, 2015, at 7:05 PM, Fabrice Florin fflo...@wikimedia.org wrote: Nicely done, Brion! We’re very grateful for all the mobile multimedia work you’ve been doing on your ‘spare time’ … Much appreciated, Fabrice

On Jun 11, 2015, at 7:34 AM, Brion Vibber bvib...@wikimedia.org wrote: I've been spending the last few days feverishly working on audio/video stuff, because it's been driving me nuts that it's not quite in working shape. TL;DR: Major fixes in the works for Android, Safari (iOS and Mac), and IE/Edge (Windows). Need testers and patch reviewers.

== ogv.js for Safari/IE/Edge ==

In recent versions of Safari, Internet Explorer, and Microsoft's upcoming Edge browser, there's still no default Ogg or WebM support, but JavaScript has gotten fast enough to run an Ogg Theora/Vorbis decoder with CPU to spare for drawing and outputting sound in real time. The ogv.js decoder/player has been one of my fun projects for some time, and I think I'm finally happy with my TimedMediaHandler/MwEmbedPlayer integration patch https://gerrit.wikimedia.org/r/#/c/165478/ for the desktop MediaWiki interface. I'll want to update it to work with Video.js later, but I'd love to get this version reviewed and deployed in the meantime. Please head over to https://ogvjs-testing.wmflabs.org/ in Safari 6.1+ or IE 10+ (or 'Project Spartan' on Windows 10 preview) and try it out! Particularly interested in cases where it doesn't work or messes up.

== Non-JavaScript fallback for iOS ==

I've found that Safari on iOS supports QuickTime movies with Motion-JPEG video and mu-law PCM audio https://gerrit.wikimedia.org/r/#/c/217295/. JPEG and PCM are, as it happens, old and not so much patented. \o/ As such this should work as a fallback for basic audio and video on older iPhones and iPads that can't run ogv.js well, or in web views in apps that use Apple's older web embedding APIs where JavaScript is slow (for example, Chrome for iOS). However these get really bad compression ratios, so to keep bandwidth down similar to the 360p Ogg and WebM versions I had to reduce quality and resolution significantly. Hold an iPhone at arm's length and it's maybe ok, but zoom full-screen on your iPad and you'll hate the giant blurry pixels! This should also provide a working basic audio/video experience in our Wikipedia iOS app, until such time as we integrate Ogg or WebM decoding natively into the app. Note that it seems tricky to bulk-run new transcodes on old files with TimedMediaHandler. I assume there's a convenient way to do it that I just haven't found in the extension maint scripts...

== In progress: mobile video fixes ==

Audio has worked on Android for a while -- the .ogg files show up in native audio elements and Just Work. But video has often been broken, with TimedMediaHandler's popup transforms reducing most video embeds into a thumbnail and a link to the original file -- which might play if WebM (not if Ogg Theora) but it might also be a 1080p original which you don't want to pull down on 3G! And neither audio nor video has worked on iOS.
This patch https://gerrit.wikimedia.org/r/#/c/217485/ adds a simple mobile target for TMH, which fixes the popup transforms to look better and actually work by loading up an embedded-size player with the appropriately playable transcodes (WebM, Ogg, and the MJPEG last-ditch fallback). ogv.js is used if available and necessary, for instance in iOS Safari when the CPU is fast enough. (Known to work only on 64-bit models.)

== Future: codec.js and WebM and OGVKit ==

For the future, I'm also working on extending ogv.js to support WebM https://brionv.com/log/2015/06/07/im-in-ur-javascript-decoding-ur-webm/ for better quality (especially in high-motion scenes) -- once that stabilizes I'll rename the combined package codec.js. Performance of WebM is not yet good enough to deploy, and some features like seeking are still missing, but breaking out the codec modules means I can develop the codecs in parallel and keep the high-level player logic in common. Browser infrastructure improvements like SIMD, threading, and more GPU access should continue to make WebM decoding faster in the future as well. I'd also like to finish up my OGVKit package https://github.com/brion/OGVKit for iOS, so we can embed a basic audio/video player at full quality into the Wikipedia iOS app. This needs some
Re: [Wikitech-l] [Multimedia] What to do with TimedMediaHandler
This is a fair assessment of the challenges / divergent code bases. In terms of a path forward, I think it’s worth highlighting how the Kaltura player normally integrates with other standalone entity providers nowadays. We normally integrate via a media proxy library that basically normalizes the representation of stream sources, media identifiers, structured and unstructured metadata, captions, cuePoints, and content security protocols to a “Kaltura like” representation; then the CMSs consume the Kaltura player iframe services in a way that is almost identical to our Kaltura Platform style embeds, just overriding identifiers. This makes the iframe player service easy to consume from our native components, Twitter embeds, etc. See the architecture overview here. [1] You can see what this looks like with a mediaProxy override sample [2]

Is this useful for the Wikimedia use case? … not so sure ... since the review scope would grow a lot if we had the player serving its own iframe independently of the rest of the code infrastructure; it would otherwise duplicate many components and reduce the incentive to align versions of things. Some significant brainstorming and alignment would need to take place, which has awaited a focus from the multimedia team, since we would want to focus efforts towards something that would be sustainable for both organizations going forward, both from community and organization contributions, so the projects could better benefit each other.

* Kaltura will move quickly to review and integrate the code styling / js-hint updates, something we have intended to do for a while. Other low-hanging-fruit alignments have already been integrated by some early work on this by paladox2015 (GitHub id).
* Kaltura would be interested in working to make things as easy as possible to use the library in both contexts; but we need “a plan”. While things have drifted significantly there are paths towards upgrading things, and a goal to align code conventions [3] means the projects share a lot more than, say, some arbitrary other project out there that would do everything its own way ;)
* That being said, the possibility for WMF to use something new should be evaluated, but again should involve the multimedia team within WMF so that the cost-benefit analysis can be mapped out per organization infrastructure support; or a similar situation will crop up after a sprint of effort produced something usable, but was not maintained going forward.

[1] http://knowledge.kaltura.com/sites/default/files/styles/large/public/kaltura-player-toolkit.png
[2] http://kgit.html5video.org/pulls/1194/modules/KalturaSupport/tests/StandAlonePlayerMediaProxyOverride.html
[3] https://github.com/kaltura/mwEmbed/#hacking-on-mwembed

On Dec 11, 2014, at 7:55 AM, Derk-Jan Hartman d.j.hartman+wmf...@gmail.com wrote: So for a while now, I have been toying a bit with TimedMediaHandler/MwEmbed/TimedText, with the long term goal of wanting it to be compatible with VE, live preview, flow etc.
There is a significant challenge here that we are sort of conveniently ignoring because stuff 'mostly works' currently and the MM team has its plate full with plenty of other stuff:
1: There are many patches in our modules that have not been merged upstream
2: There are many patches upstream that were not merged in our tree
3: Upstream re-uses RL and much infrastructure of MW, but is also significantly behind. They still use php i18n, and their RL classes themselves are also out of date (1.19 style?). This makes it difficult to get 'our' changes merged upstream, because we need to bring any RL changes etc with it as well.
4: No linting and code style checks are in place, making it difficult to assess and maintain quality.
5: Old jQuery version used upstream
6: Lots of what we consider deprecated methodologies are still used upstream.
7: Upstream has a new skin ??
8: It uses loader scripts on every page, which really aren't necessary anymore now that we can add modules to ParserOutput, but since I don't fully understand upstream, I'm not sure what is needed to not break upstream in this regard.
9: The JS modules arbitrarily add stuff to the mw. variables, no namespacing there.
10: The RL modules are badly defined, overlap each other, and some script files contain what should be in separate modules
11: We have 5 'mwembed' modules, but upstream has about 20, so they have quite a bit more code to maintain and migrate.
12: Brion is working on his ogvjs player which at some point needs to integrate with this as well (Brion already has some patches for this [1]).
13: Kaltura itself seems very busy and doesn't seem to have too
Re: [Wikitech-l] [Multimedia] ogv.js - JavaScript video decoding proof of concept
Amazing work. Added bug to integrate into TMH player. https://bugzilla.wikimedia.org/show_bug.cgi?id=61823 I can’t imagine anyone being against flash to deliver free formats! —michael On Feb 23, 2014, at 5:45 PM, Brion Vibber bvib...@wikimedia.org wrote: In case anybody's interested but not on wikitech-l; looking for some feedback on possible directions for fallback in-browser video players. -- brion -- Forwarded message -- From: Brion Vibber bvib...@wikimedia.org Date: Sun, Feb 23, 2014 at 6:43 AM Subject: Re: ogv.js - JavaScript video decoding proof of concept To: Wikimedia-tech list wikitech-l@lists.wikimedia.org Just an update on this weekend project, see the current demo in your browser[1] or watch a video of Theora video playing on an iPhone 5s![2] [1] https://brionv.com/misc/ogv.js/demo/ [2] http://www.youtube.com/watch?v=U_qSfHPhGcA * Got some fixes and testing from one of the old Cortado maintainers -- thanks Maik! * Audio/video sync is still flaky, but everything pretty much decodes and plays properly now. * IE 10/11 work, using a Flash shim for audio. * OS X Safari 6.1+ works, including native audio. * iOS 7 Safari works, including native audio. Audio-only files run great on iOS 7 devices. The 160p video transcodes we experimentally enabled recently run *great* on a shiny 64-bit iPhone 5s, but are still slightly too slow on older models. The Flash audio shim for IE is a very simple ActionScript3 program which accepts audio samples from the host page and outputs them -- no proprietary or patented codecs are in use. It builds to a .swf with the open-source Apache Flex SDK, so no proprietary software is needed to create or update it. I'm also doing some preliminary research on a fully Flash version, using the Crossbridge compiler[3] for the C codec libraries. Assuming it performs about as well as the JS does on modern browsers, this should give us a fallback for old versions of IE to supplement or replace the Cortado Java player... Before I go too far down that rabbit hole though I'd like to get peoples' opinions on using Flash fallbacks to serve browsers with open formats. As long as the scripts are open source and we're building them with an open source toolchain, and the entire purpose is to be a shim for missing browser feature support, does anyone have an objection? [3] https://github.com/adobe-flash/crossbridge -- brion On Mon, Oct 7, 2013 at 9:01 AM, Brion Vibber bvib...@wikimedia.org wrote: TL;DR SUMMARY: check out this short, silent, black white video: https://brionv.com/misc/ogv.js/demo/ -- anybody interested in a side project on in-browser audio/video decoding fallback? One of my pet peeves is that we don't have audio/video playback on many systems, including default Windows and Mac desktops and non-Android mobile devices, which don't ship with Theora or WebM video decoding. The technically simplest way to handle this is to transcode videos into H.264 (.mp4 files) which is well supported by the troublesome browsers. Unfortunately there are concerns about the patent licensing, which has held us up from deploying any H.264 output options though all the software is ready to go... While I still hope we'll get that resolved eventually, there is an alternative -- client-side software decoding. 
We have used the 'Cortado' Java applet to do fallback software decoding in the browser for a few years, but Java applets are aggressively being deprecated on today's web: * no Java applets at all on major mobile browsers * Java usually requires a manual install on desktop * Java applets disabled by default for security on major desktop browsers Luckily, JavaScript engines have gotten *really fast* in the last few years, and performance is getting well in line with what Java applets can do. As an experiment, I've built Xiph's ogg, vorbis, and theora C libraries cross-compiled to JavaScript using emscripten and written a wrapper that decodes Theora video from an .ogv stream and draws the frames into a canvas element: * demo: https://brionv.com/misc/ogv.js/demo/ * code: https://github.com/brion/ogv.js * blog some details: https://brionv.com/log/2013/10/06/ogv-js-proof-of-concept/ It's just a proof of concept -- the colorspace conversion is incomplete so it's grayscale, there's no audio or proper framerate sync, and it doesn't really stream data properly. But I'm pleased it works so far! (Currently it breaks in IE, but I think I can fix that at least for 10/11, possibly for 9. Probably not for 6/7/8.) Performance on iOS devices isn't great, but is better with lower resolution files :) On desktop it's screaming fast for moderate resolutions, and could probably supplement or replace Cortado with further development. Is anyone interested in helping out or picking up the project to move it towards proper playback?
Re: [Wikitech-l] showing videos and images in modal viewers within articles
On 05/30/2013 06:28 PM, Ryan Kaldari wrote: OK, I decided to be slightly bold. I changed the modal video threshold on en.wiki from 200px to 800px. This means all video thumbnails that are 800px or smaller will open a modal player when you click on the thumbnail. If there are no complaints from people, we can switch the modal behavior to just be the default everywhere. Try it out and let me know what you think: https://en.wikipedia.org/wiki/Congenital_insensitivity_to_pain#Presentation Ryan Kaldari

I would lean towards more like 400px. There are probably pages that already have large videos that maybe don't need to be re-modal-ized? I agree with Erik that we should autoplay after you click on the play button on a modal. https://gerrit.wikimedia.org/r/66551 Note that in iOS, modal popups require an additional click to play if loading anything asynchronously. We have done work in the Kaltura library to be smart about capturing the click gesture in thumbnail embeds [1]. In MediaWiki we may need to do something similar if we async-load the player library. But the extra click is the least of our iOS video issues, for the time being :( --michael

[1] http://player.kaltura.com/docs/thumb
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
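(For reference, the knob involved here should be TMH's minimum inline player size; the exact setting name below is an assumption from memory, so check TimedMediaHandler's defaults before relying on it:

    // Sketch of the relevant config (setting name assumed from TMH's defaults):
    // video thumbnails at or below this width open the modal pop-up player
    // instead of playing inline. en.wiki is at 800 after Ryan's change; 400
    // would be the more conservative value suggested above.
    $wgMinimumVideoPlayerSize = 400;

)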
Re: [Wikitech-l] wav support to Commons (was Re:Advice Needed)
On 04/11/2013 10:48 AM, Quim Gil wrote: I'm just trying to be consistent: a GSOC project can't force the agenda of a Wikimedia project. Also conservative when it comes to managing GSOC students' expectations. These bug reports have been open for years, and I don't want to guarantee to a GSOC student that they can count on seeing them fixed now. Bug 20252 - Support for WAV and AIFF by converting files to FLAC automatically https://bugzilla.wikimedia.org/show_bug.cgi?id=20252 Bug 32135 - WAV audio support via TimedMediaHandler https://bugzilla.wikimedia.org/show_bug.cgi?id=32135

Adding WAV to TMH is a pretty small technical addition. Audio transcoding was already added by Jan. Adding .wav support on top of that is probably one of the easiest parts of the project. I don't think the project would be forcing an agenda on Commons; it's analogous to the work to add TIFF support a while back. Also, this is mostly an intermediate solution while browsers can only capture and upload PCM WAV data. Once browsers ship the full recording API, we will be able to 'export out' the captured Opus audio and upload that, then transcode from that Opus oga to additional formats that can be played in (other) browsers and devices. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
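(For the curious, the conversion step itself is tiny; inside one of TMH's transcode jobs it would be roughly the shell-out below. The ffmpeg flags are from memory and the variable names are illustrative, so treat it as a sketch rather than the actual patch:

    // Rough sketch: convert the uploaded WAV/AIFF into FLAC via ffmpeg.
    $cmd = 'ffmpeg -y -i ' . wfEscapeShellArg( $srcWavPath ) .
        ' -acodec flac ' . wfEscapeShellArg( $dstFlacPath );
    wfShellExec( $cmd, $retval );
    if ( $retval !== 0 ) {
        wfDebugLog( 'TimedMediaHandler', "WAV to FLAC conversion failed: $cmd" );
    }

)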
Re: [Wikitech-l] Audio derivatives, turning on MP3/AAC mobile app feature request.
Yes, all that changed is we added support for audio derivatives. We have not enabled MP3 or AAC. The same code can be used for FLAC, Ogg, or whatever we configure. On Feb 3, 2013 2:33 AM, Yuvi Panda yuvipa...@gmail.com wrote: Just to be sure that I'm reading this right - nothing actually changed yet. We still are a free-formats-only shop for A/V. Right? -- Yuvi Panda T http://yuvi.in/blog
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Audio derivatives, turning on MP3/AAC mobile app feature request.
+correct content-type this time ;) Note this has already been merged, but still worth mentioning for visibility. On 2/1/13 12:10 PM, Michael Dale wrote:

We are about to merge in support for audio derivatives to Timed Media Handler (TMH). The big value here, I think, is encoding to AAC or MP3 and adding a /listen to this article/ feature to the mobile app. https://gerrit.wikimedia.org/r/#/c/39363/ This can really help with improving accessibility of Wiktionary pronunciation media files as well. Also, AAC / m4v ingestion could make audio recordings a lot easier to import into the site, i.e. a 'record a reading of this article' mobile app feature #2 ;) There are already thousands of spoken articles, and with some promotion there could probably be a lot more: http://en.wikipedia.org/wiki/Category:Spoken_articles

The software patent situation for MP3 is sad, considering how long the MP3 format has been around: http://www.tunequest.org/a-big-list-of-mp3-patents/20070226/ I think AAC is a similar situation, encoder-wise: http://en.wikipedia.org/wiki/Advanced_Audio_Coding#Licensing_and_patents But fundamentally Wikimedia is not distributing these encoders and there are no royalties for media distribution. Likewise we are not shipping decoders (the decoders are in the browser or the mobile OS). I don't know why Wikimedia's commitment to being accessible in royalty-free formats somehow also precludes making content accessible for folks on platforms that ~don't~ decode royalty-free formats. But hopefully we can change that over time. Not sure if this is the right forum for this, but I hope we could come out of this thread with rough consensus to enable these formats to help increase the reach of audio works. peace, --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
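(For anyone who wants to experiment locally now that the patch is merged: enabling a given audio derivative should just be a config toggle, roughly along these lines. The setting name and keys are assumed from TMH's defaults, so verify against TimedMediaHandler.php, and nothing here implies MP3/AAC are turned on for Wikimedia sites -- that's exactly the decision this thread is about:

    // Sketch only -- names assumed, check TimedMediaHandler.php for the real keys.
    $wgEnabledAudioTranscodeSet = array(
        'ogg',   // Vorbis: the free-format baseline
        // 'mp3', // proprietary outputs stay commented out until we decide
        // 'aac',
    );

)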
Re: [Wikitech-l] Video on mobile: Firefox works, way is paved for more browser support
On 12/13/2012 12:38 PM, Brion Vibber wrote: It's much, MUCH easier for us to flip the H.264 switch... there are ideological reasons we might not want to, but we're going to have to put the effort into making those player apps if we want all our data accessible to everyone.

+1, it's a non-trivial amount of effort to integrate native players across at least 3 major platforms (iOS, Android, Win8). And as pointed out in the thread, low-power Android / Firefox OS devices include h.264 hardware decoders but will fail for medium-resolution WebM. I think the Wikimedia mobile product team needs to come up with some recommendations for the Board / community to evaluate. There are trade-offs in effort and resource allocation. Is integrating software video decoders with native apps the best use of resources, or are there other higher-priority efforts? Or, more realistically, the ideological hard line means kicking the proverbial video-on-Wikipedia bucket further downstream, which is also a trade-off of sorts. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Video on mobile: Firefox works, way is paved for more browser support
On 12/13/2012 04:56 PM, Brion Vibber wrote: On Thu, Dec 13, 2012 at 10:38 AM, Brion Vibber bvib...@wikimedia.org wrote: On Wed, Dec 12, 2012 at 2:50 PM, Rob Lanphier ro...@wikimedia.org wrote: I was able to play the WebM file of the locomotive on the front page of https://commons.wikimedia.org just now on my Nexus 7 using Chrome, so at least on very new stock Android devices, all is well. My much older Galaxy S didn't fare so well, though, so I would be willing to believe that Android devices with proper WebM support are still relatively rare. That said, the replacement rate for this hardware is frequent enough that it won't be long before my Nexus 7 is much older. I can play the current media on the front page of Commons in Chrome on my Nexus 7, but it won't play in position on either desktop http://en.wikipedia.org/wiki/Serge_Haroche or mobile http://en.m.wikipedia.org/wiki/Serge_Haroche ... Sigh. :)

I think this relates to the page not being purged after the transcodes are updated. If you purge the page, it will probably give the Nexus a more playable flavour. http://en.wikipedia.org/wiki/Serge_Haroche should work on your Nexus now ;) TMH should add the page purge to the job queue, but I'm not sure why that page had not been purged yet.

Still some work to be done on compatibility... I also notice that the source elements in the video seem to start with the original, and aren't labeled with types or codecs. This means that without the extra Kaltura player JS -- for instance as we see it on the mobile site right now -- the browser may not be able to determine which file is playable or best-playable.

For correctness we should include type, but I don't know if that will help with the situation you describe. https://gerrit.wikimedia.org/r/#/c/38665/ But it certainly will help in the other ways you outline in bug 43101. AFAIK there are no standard source tag attributes to represent device-specific playback targets (other than type), so we set a few in data-* tags and read them within the Kaltura HTML5 lib to do flavour selection. We of course use the Kaltura HTML5 lib on lots of mobile devices, so if you want to explore usage in the mobile app, I'm happy to support that. For example, including the payload in the application itself (so it's not a page-view-time cost). peace, --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
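(Concretely, the markup the patch should end up emitting looks roughly like the sketch below -- built here with core's Html helper purely for illustration; the derivative URLs and the particular data-* names are placeholders, but the type/codecs strings are the standard ones, which is what lets a bare <video> element pick a playable flavour without any Kaltura JS:

    // Sketch: <source> entries labeled with type/codecs for native selection.
    $sources = array(
        Html::element( 'source', array(
            'src' => $webm360Url, // placeholder derivative URL
            'type' => 'video/webm; codecs="vp8, vorbis"',
            'data-width' => 640,  // illustrative device-targeting hints
            'data-height' => 360,
        ) ),
        Html::element( 'source', array(
            'src' => $ogg360Url,
            'type' => 'video/ogg; codecs="theora, vorbis"',
        ) ),
    );

)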
Re: [Wikitech-l] Video on mobile: Firefox works, way is paved for more browser support
As Brion points out, we get much better coverage. I enabled h.264 locally and ran through a set of Android, iOS and desktop browsers I had available at the time: http://www.mediawiki.org/wiki/Extension:TimedMediaHandler/Platform_testing

Pro h.264:
* No one is proposing turning off WebM; an ideological commitment to support free access with free platforms in royalty-free formats does not necessarily require excluding derivation to proprietary formats.
* We already are not ideologically pure:
** We submit to the Apple App Store terms of service, we build outputs with the non-freedom iOS toolchain, etc.
** We write custom code / workarounds to support proprietary non-web-standard browsers.
* There is little to no chance of Apple adding Google's codec support to their platform.
* We could ingest h.264, letting Commons store source material in its originally captured format. This is important so that years down the road we have the highest quality possible.
* Chicken and egg: for companies like Apple to care about Wikimedia's WebM-only support, Wikimedia would need lots of video, and as long as we don't support h.264 our platform discourages wide use of video on articles.

Pro WebM:
* Royalty-free purity in /most/ of what Wikimedia distributes.
* We could in theory add software playback of WebM to our iOS and Android apps.
* Reduced storage costs (marginal, vs. the public good of access)
* Reduced licence costs for an h.264 encoder on our two transcoding boxes (very marginal)
* Risk that MPEG-LA adds distribution costs for free online distribution in the future. Low risk, and we could always turn it off.

--michael

On 12/12/2012 11:26 AM, Luke Welling wrote: FirefoxOS/Boot2Gecko phones presumably also support Ogg Theora and WebM formats, but they're not really a market share yet and may never be in the developed world. Without trying to downplay the importance of ideological purity, keep in mind that Mozilla, who have largely the same ideology on the matter, have conceded defeat on the practical side of it after investing significant effort. Eg http://appleinsider.com/articles/12/03/14/mozilla_considers_h264_video_support_after_googles_vp8_fails_to_gain_traction With Google unwilling to commit, the battle was not winnable. There is not an ideologically pure answer that is compatible with the goal of taking video content and disseminating it effectively and globally. The conversation needs to be framed as what shade of grey is an acceptable compromise. Luke Welling

On Wed, Dec 12, 2012 at 6:44 AM, Antoine Musso hashar+...@free.fr wrote: Le 12/12/12 00:15, Erik Moeller a écrit : Since there are multiple potential paths for changing the policy (keeping things ideologically pure, allowing conversion on ingestion, allowing h.264 but only for mobile, allowing h.264 for all devices, etc.), and since these issues are pretty contentious, it seems like a good candidate for an RFC which'll help determine if there's an obvious consensus path forward. Could we host h.264 videos and related transcoders in a country that does not recognize software patents? Hints: - I am not a lawyer - WMF has servers in the Netherlands, EU. -- Antoine hashar Musso
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Switching to Timed Media Handler on Commons
On 10/28/12 11:41 PM, Rob Lanphier wrote: Hi everyone, Assuming we get the last blocking bugs fixed tomorrow, then we should be able to go onto Commons on Wednesday, so that's our current plan. Let us know if there are issues with this. Thanks! Rob

Thanks for the update, Rob. I did not see details on the configuration phases in this deployment. On test2 the configuration supports uploading WebM. Is the plan to enable WebM on Commons prior to other wikis supporting the embedding of WebM files? I guess this would not be completely disastrous; it should fail over to links back to Commons for unknown media types? Also we have not yet connected test2 to the video scalers / transcoding, so I imagine we will want to test that in the next few days? And we should know ahead of time if the derivatives are planned to be enabled on Commons as well. Again, the other wikis won’t be able to embed the derivatives until TMH is in use on them as well. I know we conducted tests for TMH “playing” well with an OggHandler provider (i.e. the test2.wikipedia.org pages are embedding Commons videos but played back with TMH) ... I am not sure if we have conducted tests for OggHandler playing well with a TMH provider with the possible configuration phases mentioned above. Happy to see progress, I will try and be available if anything comes up. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
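(For reference, the upload half of a phased rollout is just configuration; sketching from memory, and the TMH setting name in particular should be verified against the extension's defaults:

    // Phase sketch: allow WebM uploads on Commons before other wikis run TMH.
    $wgFileExtensions[] = 'webm';
    // Keep derivative generation off until the video scalers are connected
    // (setting name assumed -- check TimedMediaHandler's defaults):
    $wgEnableTranscode = false;

)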
Re: [Wikitech-l] So what's up with video ingestion?
On 06/18/2012 04:52 PM, Brion Vibber wrote: On Mon, Jun 18, 2012 at 4:44 PM, David Gerard dger...@gmail.com wrote: On 19 June 2012 00:30, Brion Vibber br...@pobox.com wrote: warning: patent politics question may lead to offtopic bikeshedding Additionally there's the question of adding H.264 transcode *output*, which would let us serve video to mobile devices and to Safari and IE 9 without any custom codec or Java installations. As far as I know that's not a huge technical difficulty but still needs to be decided politically, either before or after an initial rollout of TMH. /warning It's entirely unclear to me that this is intrinsically related. They're intrinsically related because one depends on the other to be possible. This is a one-way dependency: H.264 output depends on TimedMediaHandler support. TimedMediaHandler in general doesn't depend on H.264 output, and should not be confused with it. -- brion

I think what we should do here is go ahead and add support for ingestion and output, and then we can just adjust the settings file based on what we /decide politically/ going forward. Since both the deployment review pipeline as well as the political decision pipeline can be ~quite long~, it's probably best to have it all supported so we can just adjust a configuration file once we decide one way or another. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] What's the best place to do post-upload processing on a file? Etc.
You will want to put it into a jobQueue; you can take a look at the Timed Media Handler extension for how post-upload, processor-intensive transformations can be handled. --michael

On 05/04/2012 04:58 AM, emw wrote: Hi all, For a MediaWiki extension I'm working on (see http://lists.wikimedia.org/pipermail/wikitech-l/2012-April/060254.html), an effectively plain-text file will need to be converted into a static image. I've got a set of scripts that does that, but it takes my medium-grade consumer laptop about 30 seconds to convert the plain-text file into a ray-traced static image. Since ray-tracing the images being created here substantially improves their visual quality, my impression is that it's worth a moderately expensive transformation operation like this, but only if the operation is done once. Given that, I assume it'd be best to do this transformation immediately after the plain-text file has completed uploading. Is that right? If not, what's a better time/way to do that processing? I've looked into MediaWiki's 'UploadComplete' event hook to accomplish this. That handler gives a way to access information about the upload and the local file. However, I haven't been able to find a way to get the uploaded file's path on the local file system, which I would need to do the transformation. Looking around related files I see references to $srcPath, which seems like what I would need. Am I just missing some getter method for file system path data in UploadBase.php or LocalFile.php? How can I get the information about an uploaded file's location on the file system while in an onUploadComplete-like object method in my extension? Thanks, Eric
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
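(To make that suggestion concrete, the shape of it is roughly as below. The hook name and the job plumbing are core MediaWiki; the ray-trace job class itself is hypothetical, and the exact way to get a local filesystem path and to enqueue a job varies a bit by MediaWiki version, so treat this as a sketch rather than drop-in code:

    $wgHooks['UploadComplete'][] = 'RayTraceHooks::onUploadComplete';
    $wgJobClasses['rayTraceRender'] = 'RayTraceRenderJob';

    class RayTraceHooks {
        public static function onUploadComplete( $uploadBase ) {
            $file = $uploadBase->getLocalFile();
            // Queue the expensive render instead of doing ~30s of work in the request:
            $job = new RayTraceRenderJob( $file->getTitle(),
                array( 'filename' => $file->getName() ) );
            JobQueueGroup::singleton()->push( $job ); // older MW: $job->insert()
            return true;
        }
    }

    class RayTraceRenderJob extends Job {
        public function __construct( $title, $params ) {
            parent::__construct( 'rayTraceRender', $title, $params );
        }
        public function run() {
            $file = wfLocalFile( $this->params['filename'] );
            $srcPath = $file->getLocalRefPath(); // a readable filesystem copy
            // ... shell out to the ray tracer here, then publish the result ...
            return true;
        }
    }

)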
Re: [Wikitech-l] Chunked uploading API documentation; help wanted
Thanks, Brion (and Erik), for bringing chunked uploading closer to fruition. +Jan, can you help out with documenting the API? I will take a pass at it as well when I get a chance ;) --michael

On 04/17/2012 03:37 PM, Brion Vibber wrote: I've started adding some documentation on chunked uploading via the API on mediawiki.org: https://www.mediawiki.org/wiki/API:Upload#Chunked_uploading This info is based on watching UploadWizard at work, and may be incomplete or misleading. :) So please feel free to hop in and help clean it up, thanks! There also probably needs to be better information about stashed uploads, which has some intersection with chunked (for instance -- is it possible to do chunked upload without using the stash? Or are they required together?) -- brion
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
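(As a starting point for the docs, here is the request sequence as I understand it from watching UploadWizard, per API:Upload. The $api() helper below is hypothetical -- it just stands in for "POST these params plus the edit token to api.php as multipart form data and decode the JSON" -- and the flow may be incomplete, so verify against a live wiki:

    $chunkSize = 1024 * 1024;
    $size = filesize( $path );
    $handle = fopen( $path, 'rb' );
    $filekey = null;
    for ( $offset = 0; $offset < $size; $offset += $chunkSize ) {
        $params = array(
            'action' => 'upload',
            'stash' => 1,
            'filename' => $name,
            'filesize' => $size,
            'offset' => $offset,
            'chunk' => fread( $handle, $chunkSize ), // sent as a file part
        );
        if ( $filekey !== null ) {
            $params['filekey'] = $filekey; // returned by the first chunk request
        }
        $result = $api( $params );
        $filekey = $result['upload']['filekey'];
    }
    fclose( $handle );
    // Final request publishes the stashed upload under its real name:
    $api( array( 'action' => 'upload', 'filename' => $name,
        'filekey' => $filekey, 'comment' => 'Chunked upload test' ) );

)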
Re: [Wikitech-l] Video codecs and mobile
On 03/20/2012 03:15 AM, David Gerard wrote: We should definitely be able to ingest H.264. (This has been on the wishlist forever and is a much harder problem than it sounds.)

Once TMH is deployed, practically speaking, 'upload to YouTube -> import to Commons' will probably be the easiest path for a while, especially given the tight integration YouTube has with every phone and any capture device with web access. But yes, the feature should be developed, and it is more difficult than it sounds when you want to carefully consider things like making the source file available. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Video codecs and mobile
On 03/19/2012 06:24 PM, Brion Vibber wrote: In theory we can produce a configuration with TimedMediaHandler to produce both H.264 and Theora/WebM transcodes, bringing Commons media to life for mobile users and Apple and Microsoft browser users. What do we think about this? What are the pros and cons? -- brion

The point about mobile is very true, and it's very, very difficult to displace entrenched formats, especially when they're tied up in hardware support. And of course the Kaltura HTML5 library used in TMH has a lot of iPad and Android H.264 support code in there for all the commercial usage of the library, so it would not be a technical challenge to support it. But I think we should get our existing TMH out the door exclusively supporting WebM and Ogg. We can revisit adding support for other formats after that. High on that list is also MP3 support, which would have similar benefits for audio versions of articles and mobile hardware-supported audio playback.

If people felt it was important, by the end of the year we could have JavaScript-based WebM decoders for supporting WebM in IE10 (in case people never saw this: https://github.com/bemasc/Broadway ). But of course this could be seen as 'insert your favourite misguided good-efforts analogy here', i.e. maybe efforts are better focused on tools streamlining the video contribution process. Maybe we focus on a way to upload h.264 videos from mobile. Of course, doing mobile h.264 uploads correctly would ideally include making source content available, for maximising re-usability of content without the quality loss in multiple encoding passes, so in effect running up against the very principle that governs the Wikimedia projects: to make content a freely reusable resource.

I think Mozilla adding /desktop/ h.264 support may hurt free formats. On desktop they already have strong market share, and right now many companies actually request including WebM in their encoding profiles (on Kaltura), but that of course would not be true if Mozilla supports h.264 on desktop, and it would make it harder for Google Chrome to follow through on their promise to only support WebM (if they still plan on doing that). For mobile it makes sense: Mozilla has no market share there and they have to be attractive to device manufacturers, create a solid mobile user experience, fit within device battery life expectations, etc. And on mobile there is no fallback to Flash if the site can't afford to encode all their content into free formats and multiple h.264 profiles. And they can't afford that on a browser / platform that people generally have to /choose/ to install and use. If they support h.264 on desktop it will be a big setback for free formats, because there won't be any incentive for the vast majority of pragmatic sites to support WebM. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Image uploading social workflow
There was the Add Media Wizard project from a while back ( that sounds similar to what you describe ) http://www.mediawiki.org/wiki/Extension:Add_Media_Wizard I wanted to take a look at integrating upload wizard into it post TMH deployment, and or something new could be built as a gadget as well. --michael On 12/18/2011 07:22 AM, Gregor Hagedorn wrote: The improvements to the Upload Wizard are very welcome, but socially, I think it is still broken. Please correct me if I overlook something or overlook another extension. Socially, I believe many mediawiki extensions need a way to ask for images on a topic page, provide an upload wizard AND display the results on the topic page. Presently, even if the image is added to the wiki or a commons repository, it simply disappears in a black hole from the perspective of the contributing image author. I believe it is possible to have a wizard option which does the following: * store the page context from which it was called. * upload images to local wiki or a repository * open the page context in edit mode * search for some form of new-images-section ** a possible implementation of this could be a div with id=newimages containing a gallery tag * if new-images-section exists: add images, if not create with new images. * Save context page. Presently, WMF is possibly the biggest driver of open content (CC BY/CC BY-SA) but is able to collect images only from the small population that is the intersection of the population or people able to edit mediawiki and the huge population able to provide quality images. The new-images-section solution would probably not directly work for wikipedia itself; here some more complex review mechanism (new images gallery would be shown only to some users, including image uploader, or so) would be needed, perhaps in combination with flagged rev. I view this feature however potentially as a two step process: implement with direct addition to page, modify to optimize for flagged revs. However, I think something the described feature would be needed; presently all these crowdsourcing images are mostly collected by projects that either use no open content license at all, or the NC license at best. WMF is not able to exert its potential pull towards open content in this area. Also reported as https://bugzilla.wikimedia.org/show_bug.cgi?id=33234 Gregor ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Minor API change, ApiUpload
In terms of a DB schema friendly to chunks, we would probably want another table and associate it with a stashed file. Russ was discussing adding support for appending files within the Swift object store application logic, so we may not have to be concerned with storing chunk references in the DB? Another potential usage for a media stash is transcoding non-free formats. Here we could use the media stash as a temporary location to put the media files while we transcode them. Once transcoded, we could then move them into the published space. But I would not be too worried about incorporating that into the DB schema until we get to implementation. --michael

On 07/13/2011 12:30 AM, Bryan Tong Minh wrote: Great work people! I've really been waiting for this, and I'm glad that it has been finally implemented. A remark about extensibility: in the future we might want to use the upload stash for more advanced features like asynchronous uploads and chunked uploads. I think the database schema should already be prepared for this, even if we're not using it. For this purpose I would at least add us_status. Perhaps Michael has some ideas what such a database schema should further incorporate. Cheers, Bryan
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] thumbnail generation for Extension:OggHandler
Thanks for this thread. Please do commit fixes for 1.17; if it's not obvious already, I have really only been targeting trunk. While the extension has been around since before 1.16, it would be very complicated to restore the custom resource loader it was using before there was a resource loader in core. In terms of Superpurge, that is essentially what we do with the transcode status table on the image page, which lets users with the given permission purge the transcodes. We did not want to make it too easy to purge transcodes, because once purged a video could be inaccessible for devices / browsers that only had WebM or only had Ogg support, until the file was retranscoded. --michael

On 07/11/2011 04:48 AM, Dmitriy Sintsov wrote: Yes, locally patched both issues, now runs fine. $wgExcludeFromThumbnailPurge is not defined in 1.17, made a check. BTW, it is a bit evil to exclude some extensions from Purge. Maybe there should be another action, Superpurge? Linker::link() was called statically in the TranscodeStatusTable class. Created new Linker() and now it works. I haven't committed into svn, don't know if anyone cares about backwards compatibility. BTW, if I was a leading developer (who makes decisions), I'd probably make MW 1.16 an LTS (long-term support) version for legacy setups and extensions.. Though maybe that is unneeded (too much of a burden for a non-profit organization). Dmitriy
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] thumbnail generation for Extension:OggHandler
I recommend using the static binaries hosted on firefogg or if you want to compile it your self using the build tools provided there: http://firefogg.org/nightly/ Also I would suggest you take a look at TimedMediahandler as an alternative to oggHandler it has a lot more features such as WebM, timed text, and transcoding support. http://www.mediawiki.org/wiki/Extension:TimedMediaHandler A live install is on prototype if you want to play around with it: http://prototype.wikimedia.org/tmh/ If you run into any issue, please report them on the bug tracker or directly to me. https://bugzilla.wikimedia.org/enter_bug.cgi?product=MediaWiki%20extensionscomponent=TimedMediaHandler peace, --michael On 07/08/2011 04:18 AM, Dmitriy Sintsov wrote: Hi! What's the proper way of thumbnail generation for Ogg media handler, so it will work like at commons? First, I've downloaded and compiled latest ffmpeg version (from git://git.videolan.org/ffmpeg.git) using the following configure options: ./configure --prefix=/usr --disable-ffserver --disable-encoder=vorbis --enable-libvorbis The prefix is usual for CentOS layout (which I have at hosting) and best options for vorbis were suggested in this article: http://xiphmont.livejournal.com/51160.html I've downloaded Apollo_15_launch.ogg from commons then uploaded to my wiki to check Ogg handler. The file was uploaded fine, however the thumbnail is broken - there are few squares at gray field displayed instead of rocket still image. In Extension:OggHandler folder I found ffmpeg-bugfix.diff. However there is no libavformat/ogg2.c in current version of ffmpeg. Even, I found the function ogg_get_length () in another source file, however the code was changed and I am not sure that manual comparsion and applying is right way. It seems that the patch is suitable for ffmpeg version developed back in 2007 but I was unable to find original sources to successfully apply the patch. I was unable to find ffmpeg in Wikimedia svn repository. Is it there? Then, I've tried svn co https://oggvideotools.svn.sourceforge.net/svnroot/oggvideotools oggvideotools but I am upable to compile neither trunk nor branches/dev/timstarling version, it bails out with the following error: -- ERROR: Theora encoder library NOT found -- ERROR: Theora decoder library NOT found -- ERROR: Vorbis library NOT found -- ERROR: Vorbis encoder library NOT found -- ogg library found -- GD library and header found CMake Error at CMakeLists.txt:113 (MESSAGE): I have the following packages installed: libvorbis-1.1.2-3.el5_4.4 libvorbis-devel-1.1.2-3.el5_4.4 libogg-1.1.3-3.el5 libogg-devel-1.1.3-3.el5 libtheora-devel-1.0alpha7-1 libtheora-1.0alpha7-1 ffmpeg compiles just fine (with yasm from alternate repo, of course). But there is no libtheoradec, libtheoraenc, libvorbisenc neither in main CentOS repository nor in aliernative http://apt.sw.be/redhat/el5/en/i386/rpmforge/RPMS/ However it seems these is libtheoraenc.c in ffmpeg; what is the best source of these libraries? It seems that there is no chance to find proper rpm's for CentOS and one need to compile these from sources? Dmitriy ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] ResourceLoader JavaScript validation on trunk (bug 28626)
On 07/06/2011 03:04 PM, Brion Vibber wrote: Some of you may have found that ResourceLoader's bundled minified JavaScript loads can be a bit frustrating when syntax errors creep into your JavaScript code -- not only are the line numbers reported in your browser of limited help, but a broken file can cause *all* JS modules loaded in the same request to fail[1]. This can manifest as for instance a jquery-using Gadget breaking the initial load of jquery itself because it gets bundled together into the same request.

Long term, I wonder if we should not be looking at Closure Compiler [1]; we could gain an additional 10% or so compression with simple optimisations, and it has tools for inspecting compiled output [2]. Longer term we could work toward making code compatible with advanced optimisations; as a side effect we could get improved jsDoc docs, and even better compression and optimisations would be possible. [1] http://code.google.com/closure/compiler/ [2] http://code.google.com/closure/compiler/docs/inspector.html
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] background shell process control and exit status
For the TimedMediaHandler I was adding more fine-grained control over background processes [1] and ran into a Unix issue around getting both a pid and exit status for a given background shell command. Essentially with a background task I can get the pid or the exit status but can't seem to get both:

to get the pid:
    $pid = wfShellExec( "nohup nice -n 19 $cmd > /tmp/stdout.log & echo $!" );
put the exit status into a file:
    $pid = wfShellExec( "nohup nice -n 19 $cmd > /tmp/stdout.log; echo $? > /tmp/exit.status" );

But if I try to get both, either my exit status is for the echo pid command or my pid is for the echo exit status command. It seems like there should be some shell trick to back-reference background tasks or something. If nothing else I think this could be done with a shell script, passing in a lot of path targets and using the wait $pid command at the end to grab the exit code of the background process. I did a quick guess at what this would look like in that same commit [1], but would rather just do some command line magic instead of putting a .sh script in the extension. [1] http://www.mediawiki.org/wiki/Special:Code/MediaWiki/90068 peace, --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
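(One bit of command-line magic that might do it: wrap the whole thing in a backgrounded subshell, write the encoder's real pid out immediately, and let the subshell wait for the encoder and record its exit status. Sketched below via wfShellExec, untested, and the interaction with MediaWiki's shell limit wrapper would need checking; $encodeCmd, $pidFile, $statusFile and $stdoutLog are illustrative names:

    $wrapper = '( nice -n 19 ' . $encodeCmd .
        ' > ' . wfEscapeShellArg( $stdoutLog ) . ' 2>&1 &' .
        ' echo $! > ' . wfEscapeShellArg( $pidFile ) . ';' .
        ' wait $!;' .
        ' echo $? > ' . wfEscapeShellArg( $statusFile ) .
        ' ) > /dev/null 2>&1 &';
    wfShellExec( $wrapper );
    // Returns right away; poll $pidFile for progress/kill decisions and
    // $statusFile (it appears when the encode finishes) for the exit code.

)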
Re: [Wikitech-l] background shell process control and exit status
On 06/14/2011 05:41 PM, Platonides wrote: Do you want the command to be run asynchronously or not? If you expect the status code to be returned by wfShellExec(), then the process will obviously have finished and there's no need for the PID. OTOH if you launch it as a background task, you will want to get the PID, and then call pcntl_waitpid* on it to get the status code. *pcntl_waitpid() may not work, because $cmd is unlikely to be a direct child of PHP. You could also be expecting to check it from a different request. So you would enter into the world of killing -0 the process to check if it's still alive.

Yes, the idea is to run the command asynchronously so we can monitor the transcode progress and kill it if it stops making progress. Calling pcntl_waitpid with pcntl_fork, as Tim mentions, may be the way to get it done, with the child including the pcntl_waitpid call and the parent monitoring progress and killing the child if need be. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
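(Roughly what that fork approach could look like -- it assumes the pcntl/posix extensions are available in the CLI context, and the stall check is a hypothetical placeholder for whatever progress heuristic the transcode job uses:

    $pid = pcntl_fork();
    if ( $pid === 0 ) {
        // Child: run the encode; wfShellExec() fills in the exit status.
        wfShellExec( $cmd, $status );
        exit( $status );
    } elseif ( $pid > 0 ) {
        // Parent: poll until the child exits, killing it if progress stalls.
        while ( pcntl_waitpid( $pid, $childStatus, WNOHANG ) === 0 ) {
            if ( transcodeHasStalled( $outputFile ) ) { // hypothetical check
                posix_kill( $pid, SIGTERM );
            }
            sleep( 5 );
        }
    }

)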
Re: [Wikitech-l] [Foundation-l] YouTube and Creative Commons
On 06/04/2011 06:43 PM, David Gerard wrote: A question that wasn't clear from reading the bug: why is reading a file format (WebM) blocked on the entire Timed Media Handler?

It would be complicated to support WebM without an improved player and transcoding support. All the IE users, for example, can only decode Ogg with Cortado; if we don't use TMH, WebM files embedded in articles would not play for those users. Likewise, older versions of Firefox only play back Ogg. Additionally, HD files embedded into articles are already an issue with users uploading variable-bitrate HD Oggs, giving a far from ideal experience on most Internet connections and most in-browser playback engines. This would be an issue for variable-bitrate WebM files as well (without the transcoding support of TMH). Other features that have been living in the mwEmbed gadget for a long time, like timed text, remote embedding / video sharing, and temporal media references / embeds, are all better supported in TMH as an extension, so it would be good to move those features over. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Code review process (was: Status of more regular code deployments)
On 06/01/2011 08:28 AM, Chad wrote: I don't think revert in 72 hours if it's unreviewed is a good idea. It just discourages people from contributing to areas in which we only have one reviewer looking at code. I *do* think we should enforce a 48hr revert if broken rule. If you can't be bothered to clean up your breakages within 48 hours of putting your original patch in, it must not have been very important. -Chad

I think a revert on sight, if broken, is fair ... you can always re-add it after you fix it ... if it's a 'works differently than expected' type issue / not perfectly following coding conventions, a 48hr window to make progress (during the work week) sounds reasonable. --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Media embedding: oEmbed feedback?
Sorry for the re-post (having trouble with the wikitech-l list post email migration :( ). I would also be interested in discussing this in Berlin or otherwise ;) I can offer some notes about video embedding inline:

On 04/29/2011 03:30 PM, Brion Vibber wrote: Enhanced media player goodies like embedding have been slowly coming along, with a handy embedding option now available in the fancy version of the media player running on Commons. This lets you copy a bit of HTML you can paste into your blog or other web site to drop in a video and make it playable -- nice! Some third-party sites will also likely be interested in standardish ways of embedding offsite videos from Youtube, Vimeo, and other providers.

It appears the iframe embed method is becoming a somewhat standardised way to share videos, with YouTube, Vimeo, and others providing it as an option to deliver both Flash and HTML5 players. The bit of HTML that you copy from the Commons share video function is just an iframe (similar to those other sites). Timed Media Handler works the same way, using the same URL parameter (embedplayer=yes), so that we can seamlessly replace the 'fancy media player' rewrite with a similar embed player page delivered by the TMH extension [1]. The iframe player lets you sandbox the player when you embed it in foreign domain contexts, and enables you to deliver the interface that includes things like the credits screen that parses our description template page on Commons to present credit information and a link back to the description page. As iframe embed is relatively standard, we simply have to request that our domain be whitelisted for it to be shared on Facebook, WordPress, etc. In addition to working as a pure iframe without XSS JavaScript, to support mashups like Google's player [2], if you include a bit of JS where you embed the iframe, the mwEmbed player also has an iframe API that lets you use the HTML5 video API on the iframe as if it was a video tag in the page. [3]

oEmbed is a nice way to consistently 'discover' embed code and media properties. Its implementation within MediaWiki would be akin to supporting RSS or OpenSearch, so I think it's something we should try and do. As the spec currently stands, it's an API for the embed code rather than an API for mashups. I think more interesting things could be done in addition to the iframe, object tag and basic metadata ... like giving the URLs to all the media files, and URLs to all the associated timed text of a given player ... Something like the ROE standard [4] that we (Xiph, Annodex) folks were talking about a while back might be a good direction to extend oEmbed into. (Although commercial video service sites are not likely to be interested in mash-ups outside of their player, hence oEmbed leaning toward 'html' to embed the players... direct links to associated media is one of those standard ideas that in theory is good, but does not play well with video service business models ... but that does not have to stop us / oEmbed from promoting it :) I would also add that TMH adds a separate API entry point to deliver some of this info, such as the URLs for all the derivatives related to a particular media title [5]. I would like to add associated timed text listing to that videoinfo prop, and from there it should not be hard to adapt that to a ROE or oEmbed v2 type representation.
[1] http://prototype.wikimedia.org/timedmedia/Main_Page#Iframe_embed_and_viral_sharing [2] http://code.google.com/apis/youtube/iframe_api_reference.html [3] http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/TimedMediaHandler/MwEmbedModules/EmbedPlayer/resources/iframeApi/ [4] http://wiki.xiph.org/index.php/ROE [5] http://prototype.wikimedia.org/tmh/api.php?action=querytitles=File:Shuttle-flip.webmprop=videoinfoviprop=derivativesformat=jsonfm ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
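(To make the oEmbed idea a bit more concrete: a minimal 'video' type response for a Commons file might look roughly like the sketch below. The field names come from the oEmbed spec; the URL form and dimensions are purely illustrative, with the embedplayer=yes iframe standing in for the same markup the share dialog hands out today:

    $response = array(
        'version' => '1.0',
        'type' => 'video',
        'title' => 'Shuttle-flip.webm',
        'provider_name' => 'Wikimedia Commons',
        'width' => 640,
        'height' => 360,
        'html' => '<iframe src="https://commons.wikimedia.org/w/index.php' .
            '?title=File:Shuttle-flip.webm&embedplayer=yes" ' .
            'width="640" height="360" frameborder="0"></iframe>',
    );
    echo json_encode( $response );

)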
Re: [Wikitech-l] Status of 1.17 1.18
On 04/15/2011 12:07 PM, Brion Vibber wrote: Unexercised code is dangerous code that will break when you least expect it; we need to get code into use fast, where it won't sit idle until we push it live with a thousand other things we've forgotten about.

Translatewiki deserves major props for running a real-world wiki on trunk. It's hard to count all the bugs that get caught that way. Maybe once the heterogeneous deployment situation gets figured out we could do something similar with a particular project... peace, --michael
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Syntax-highlighting JS CSS code editor gadget embedding Ace
Very cool. Especially given the development trajectory of Ace to become the Eclipse of web IDEs, there will be a lot of interesting possibilities, as we could develop our own MediaWiki-centric plugins for the platform. I can't help but think about where this is ideally headed ;) A Gitorious-type system for easy branching, with mediawiki.org code-review-style tools and in-browser editing. With seamless workflows for going from per-user development and testing on the live site, to commits to your personal repository, to being reviewed and tested by other developers, to being enabled by interested users, to being enabled by default if so desired. A lot of these workflows could be prototyped without many complicated infrastructure improvements, since this basic process is already happening in a roundabout way ... (sometimes in a roundabout broken way). A developer gadget could include a simple system for switching between a local checkout of the scripts and the live versions, and support pushing a particular local copy live or, in the case of the online Ace editor, bootstrapping a particular page with the state of your script (using the Draft extension concept) so we don't have to save every edit when you want to test your code. We could specify a path structure within our existing svn to keep in sync with all gadgets and site scripts, then have our 'developer gadget' understand that path structure so you could seamlessly switch between the local and live gadget. (I was manually doing something similar in my own gadget development.) This could also help encourage gadget centralisation. We could then also link into the code review system for every site script and gadget, with one-click import of a particular version of the script (ideally once the script has been reviewed by other developers). Svn commits would not necessarily be automatically pushed to the wiki, but edits to the wiki page would always be pushed to the svn. Or maybe a sign-off in code review results in the push from svn to wiki, though we would not want to slow down fixes getting pushed out. We would have to see what workflows work best for the community. mmm ... this would probably work better with git :P ... but that is certainly not a show stopper to experimenting with improving these workflows. peace, --michael On 04/12/2011 07:40 PM, Brion Vibber wrote: While pondering some directions for rapid prototyping of new UI stuff, I found myself lamenting the difficulty of editing JS and CSS code for user/site scripts and gadgets: * lots of little things to separately click and edit for gadgets * no syntax highlighting in the edit box * no indication of obvious syntax errors, leading to frequent edit-preview cycles (especially if you have to turn the gadget back off to edit successfully!) * no automatic indentation! * can't use the tab key Naturally, I thought it might be wise to start doing something about it. I've made a small gadget script which hooks into editing of JS and CSS pages, and embeds the ACE code editor (http://ace.ajax.org -- a component of the Cloud9 IDE, formerly Skywriter formerly Mozilla Bespin). This doesn't fix the usability issues in Special:Gadgets, but it's a heck of a lot more pleasant to edit the gadget's JS and CSS once you get there. :) The gadget is available on www.mediawiki.org on the 'Gadgets' tab of preferences. Note that I'm currently loading the ACE JavaScript from toolserver.org, so you may see a mixed-mode content warning if you're editing via secure.wikimedia.org. (Probably an easy fix.) Go try it out!
http://www.mediawiki.org/wiki/MediaWiki:Gadget-CodeEditor.js IE 8 kind of explodes and I haven't had a chance to test IE9 yet, but it seems pretty consistently nice on current Firefox and Chrome and (barring some cut-n-paste troubles) Opera. I'd really love to be able to use more content-specific editing tools like this, and using Gadgets is a good way to make this sort of tool available for testing in a real environment -- especially once we devise some ways to share gadgets across all sites more easily. I'll be similarly Gadget-izing the SVG-Edit widget that I've previously done as an extension so folks can play with it while it's still experimental, but we'll want to integrate them better as time goes on. -- brion ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
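For anyone curious what the gadget's hook-in amounts to, here is a minimal sketch assuming the Ace build is already loaded; wpTextbox1 and editform are the standard MediaWiki edit-form ids, but the mode-setting call follows current Ace conventions and the real gadget's wiring differs.

if ( /\.js$/.test( wgPageName ) && wgAction === 'edit' ) {
    // Hide the plain textarea and put an Ace editor in its place.
    var textarea = document.getElementById( 'wpTextbox1' );
    var shell = document.createElement( 'div' );
    shell.id = 'ace-shell';
    shell.style.height = '30em';
    shell.textContent = textarea.value;
    textarea.style.display = 'none';
    textarea.parentNode.insertBefore( shell, textarea );

    var editor = ace.edit( 'ace-shell' );
    editor.getSession().setMode( 'ace/mode/javascript' );

    // Copy the edited buffer back into the real textarea on save.
    document.getElementById( 'editform' ).onsubmit = function () {
        textarea.value = editor.getSession().getValue();
    };
}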
[Wikitech-l] writing phpunit tests for extensions
I had a bit of a documentation challenge approaching the problem of writing phpunit tests for extensions, mostly because many of the extensions do this very differently and the manual did not have any recommendations. It appears many extensions have custom bootstrapping code (somewhat hacky path discovery and manual loading of core MediaWiki files), and they don't necessarily register their tests in a consistent way. I wrote up a short paragraph of what I would recommend here: http://www.mediawiki.org/wiki/Manual:Unit_testing#Writing_Unit_Test_for_Extensions If that makes sense, I will try to open up some bugs on the extensions with custom bootstrapping code, and I would recommend we commit an example tests/phpunit/suite.extension.xml file for exclusively running extension tests. Eventually it would be ideal to be able to 'just test your extension' from the core bootstrapper (ie dynamically generate our suite.xml and namespace the registration of extension tests) ... but for now, at least not having to wait for all the core tests as you write your extension tests, plus some basic documentation on how to do so, seems like a step forward. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] writing phpunit tests for extensions
On 04/04/2011 02:20 PM, Platonides wrote: Michael Dale wrote: Eventually it would be ideal to be able to 'just test your extension' from the core bootstraper (ie dynamically generate our suite.xml and namespace the registration of extension tests) ... but for now at least not having to wait for all the core tests as you write you extension tests and some basic documentation on how to do seems like a step forward. --michael If your tests are in just one file, you can simply pass it as a parameter to tests/phpunit/phpunit.php that's cool. We should add that info to the phpunit.php --help output, and to the unit testing wiki page. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Focus on sister projects
On 04/02/2011 04:08 PM, Ryan Kaldari wrote: 2. Creating The Complete Idiot's Guide to Writing MediaWiki Extensions and The Complete Idiot's Guide to Writing MediaWiki Gadgets (in jQuery) +1 ... Beyond the guide we could win a lot by centralising some of the scripts and libraries on mediawiki.org and establishing best practices for things like gadget localisation. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Future: Love for the sister projects!
On 04/03/2011 10:56 AM, Brion Vibber wrote: Harder, but very interesting in the medium to long-term: We would do well to survey and analyse other gadget, widget and add-on systems and communities that exist on web platforms. Not to say that Wikipedia's needs are the same, just that there are probably a lot of ideas to borrow from, and a good gadgets plan will have a few phases of implementation. Some anecdotal notes: * Gadgets or widgets have to get per-user permission confirmation before they can take certain actions on your behalf. If we had an iframe postMessage api proxy bridge that managed permissions for an open, sandboxed wiki gadget site, we could potentially lower the criteria for entry and be a bit better off than including random JS in your userpage .js page (a rough sketch follows below). * There are very fluid search and browsing interfaces to find, 'install' and share gadgets / add-ons. This includes things like ratings, usage statistics, 'share this', author information etc. ** Visibility. Many editors and viewers probably have no idea gadgets exist. With the exception of projects globally enabling a gadget, many features are pretty much hidden from users. It's sort of a chicken-and-egg issue, but in addition to highlighting content, the sites' main pages could also highlight good usage of in-site tools and features. Like Commons featuring a densely annotated image to highlight the image annotator, or the community portal of Wikipedia directly linking to an interactive JS gadget that enables a particular patrol workflow or article assessment task. The withJS system is kind of a hack for a direct link into a gadget feature, which I have used a lot, but a more formal, easy opt-in mechanism would be nice... * Check out https://addons.mozilla.org/en-US/developers/ We have decent documentation for extensions and core MediaWiki development, but the gadget effort is somewhat ad hoc, not very centralised, and best practices are not very well documented (although recent efforts are a step in the right direction :) peace, michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
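A toy sketch of the postMessage permission bridge idea mentioned in the first bullet; the approved-origins list, action names and message format are all illustrative assumptions, not an existing MediaWiki API.

// The wiki-side frame only honours requests from origins the user has
// approved, and only for the approved actions. How the approval list is
// stored and edited is hand-waved here.
var approvedOrigins = { 'http://tools.example.org': [ 'query', 'parse' ] };

window.addEventListener( 'message', function ( event ) {
    var allowed = approvedOrigins[ event.origin ];
    var request = JSON.parse( event.data );
    if ( !allowed || allowed.indexOf( request.action ) === -1 ) {
        return; // unapproved origin or action: ignore
    }
    // ... perform the local api call here, then report back ...
    event.source.postMessage(
        JSON.stringify( { id: request.id, ok: true } ), event.origin );
} );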
Re: [Wikitech-l] Where exactly are the video and audio players at?
On 03/24/2011 04:45 AM, Joseph Roberts wrote: Actually, looking through OggHandler, I do think that developing a separate entity may work well. I'm not quite sure what is wanted by the general public and would like to do what is wanted by the majority, not just what would be easiest or even the best. What would be the best way to implement an HTML5 player in MediaWiki? TIA - Joseph Roberts There is the Extension:TimedMediaHandler, which implements multi-format, multi-bitrate transcoding with automatic source selection, an HTML5 player interface, timed text, temporal media fragments, gallery and search pop-up players, viral iframe sharing / embedding, etc. Demo page here: http://prototype.wikimedia.org/timedmedia/Main_Page peace, michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] ie9 webm components
http://blogs.msdn.com/b/ie/archive/2011/03/16/html5-video-update-webm-for-ie9.aspx It appears to be a little rough around the edges, but it should bode well for Wikimedia video support as IE9 starts to be pushed out to Windows machines, and hopefully we won't have IE7 and 8 around for as long as we have had IE6 ;) If you have not already seen it, the TimedMediaHandler extension supports transcoding to both WebM and Ogg for MediaWiki video assets: http://prototype.wikimedia.org/timedmedia/Main_Page I will integrate links to http://tools.google.com/dlpage/webmmf/ for IE9 users in the mwEmbed player once the components are working a bit more smoothly. peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] use jQuery.ajax in mw.loader.load when load script
On 02/18/2011 01:01 PM, Roan Kattouw wrote: 2011/2/18 Philip Tzou philip@gmail.com: jQuery's ajax method provides a better way to load a javascript, and it can detect when the script has been loaded and execute the callback function. I think we can implement it in our mw.loader.load. jQuery.ajax provides two ways (ajax or inject) to load a javascript; you should set cache=true to use the inject one. I guess we could use this when loading stuff from arbitrary URLs in the future, but for normal module loads the mediaWiki.loader.implement() call in the server output works fine. Client side, there is the mediaWiki.loader.using call, which allows you to supply a callback; unfortunately there are some bugs in debug mode output where implement gets called before the scripts are actually ready, but it should work in production mode. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
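For reference, the callback pattern being described looks roughly like this; jquery.ui.dialog is just an example module name, and the error callback is optional.

// Load a module, then run dependent code once it has actually arrived.
mediaWiki.loader.using( 'jquery.ui.dialog', function () {
    // ready callback: the module's code is guaranteed to be present here
    $( '#my-dialog' ).dialog( { modal: true } );
}, function ( error ) {
    // optional error callback
    if ( window.console ) {
        console.log( 'module failed to load', error );
    }
} );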
Re: [Wikitech-l] File licensing information support
On 01/22/2011 01:15 PM, Bryan Tong Minh wrote: Handling metadata separately from wikitext provides two main advantages: it is much more user friendly, and it allows us to properly validate and parse data. This assumes wikitext is simply a formatting language; really it's a data storage, structure and presentation language. You can already see this in the evolution of templates as both data and presentation containers. It seems like a bad idea to move away from leveraging flexible data properties used in presentation. On Commons we have Template:Information, which links out into numerous data triples for asset presentation (ie Template:Artwork, Template:Creator, Template:Book, with sub-data relationships like Artwork.Location referencing the Institution template). If tied to an SMW backend you could say: give me artwork in room Pavillon de Beauvais at the Louvre that is missing a created-on date. We should focus on APIs for template editing; Extension:Page_Object_Model seemed like a step in the right direction. Something that lets you edit structured data across nested template objects, with validation stacked on top of that, would let us leverage everything that has been done and keep things wide open for what's done in the future. Most importantly we need clean high-level APIs that we can build GUIs on, so that the flexibility of the system does not hurt usability and functionality. Having a clear separate input text field Author: is much more user friendly than {{#fileauthor:}}, which is, so to say, a type of obscure MediaWiki jargon. I know that we could probably hide it behind a template, but that is still not as friendly as a separate field. I keep on hearing that, especially for newbies, a big blob of wikitext is plain scary. We regulars may be able to quickly parse the structure in {{Information}}, but for newbies this is certainly not so clear. We actually see from the community that there is a demand for separating the metadata from the wikitext -- this is after all why they implemented the uselang= hacked upload form with a separate text box for every meta field. I don't know... see all the templates mentioned above... To be sure, I think we need better interfaces for interacting with templates. Also, a separate field allows MediaWiki to understand what a certain input really means. {{#fileauthor:[[User:Bryan]]}} means nothing to MediaWiki or re-users, but Author: Bryan___ [checkbox] This is a Commons username can be parsed by MediaWiki to mean something. It also allows us to mass change, for example, the author. If I want to change my attribution from Bryan to Bryan Tong Minh, I would need to edit the wikitext of every single upload, whereas in the new system I go to Special:AuthorManager and change the attribution. A Semantic MediaWiki-like system retains this meaning for MediaWiki to interact with at any stage of data [re]presentation, and of course supports flexible meaning types. Similar to categories, and all other user-edited metadata. Categories are a good example of why metadata does not belong in the wikitext. If you have ever tried renaming a category... you need to edit every page in the category and rename it in the wikitext. Commons is running multiple bots to handle category rename requests. All these advantages outweigh the pain of migration (which could presumably be handled by bots) in my opinion.
Unless your category was template-driven, in which case you just update the template ;) If your category was instead magically associated with the page, outside of template-built wiki page text, how do you procedurally build data associations? --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
On 01/21/2011 08:21 AM, Chad wrote: While I happen to think the licensing issue is rather bogus and doesn't really affect us, I'm glad to see it resolved. It outperforms our current solution and keeps the same behavior. Plus as a bonus, the vertical line smushing is configurable so if we want to argue about \n a year from now, we can :) Ideally we will be using Closure Compiler by then, and since it rewrites functions and variable names and sometimes collapses multi-line functionality, newline preservation will be a moot point. Furthermore, Google even has a nice add-on to Firebug [1] for source code mapping, making the dead horse even more dead. I feel like we are stuck back in time, arguing about optimising code that came out eons ago in net time (more than 7 years ago). There are more modern solutions that take these concerns into consideration and do a better job at it (ie not just a readable line, but a pointer back to the line of source code that is of concern). [1] http://code.google.com/closure/compiler/docs/inspector.html peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] File licensing information support
On 01/21/2011 02:45 AM, Alex Brollo wrote: The interest of the Wikisource project in a formal and standardized set of book metadata (I presume from Dublin Core) in a database table is obvious. Some preliminary tests on it.source suggest that templates and the Labeled Section Transclusion extension could have a role as existing wikitext containers for semantized variables; the latter perhaps more interesting than the former, since their content can be accessed directly from any page. I'd like book metadata to be considered from the beginning of this interesting project. Alex This quickly dovetails into the Semantic MediaWiki discussion... for which there are other threads on this list to reference. There is a wiki data summit / meeting coming up where these issues will likely be discussed. Maybe we could start eliciting requirements and needs of projects -- like what you describe for Wikisource, and others that have been listed elsewhere -- on a pre-meeting project page; this way we can be sure to hit all these items during the meeting. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] File licensing information support
On 01/20/2011 05:00 PM, Platonides wrote: I would have probably gone by the page_props route, passing the metadata from the wikitext to the tables via a parser function. I would also say it's probably best to pass metadata from the wikitext to the tables via a parser function, similar to categories and all other user-edited metadata. This has the disadvantage that it's not 'as easy' to edit via a structured api entry point, but has the advantage of working well with all the existing tools, templates and versioning. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Farewell JSMin, Hello JavaScriptDistiller!
As mentioned in the bug, it would be nice to have configurable support for the closure-compiler as well ;) ( I assume Apache licence is compatible? ) Has anyone done any tests to see if there are any compatibility issues with SIMPLE_OPTIMIZATIONS with a google closure minification hook? --michael On 01/20/2011 04:13 PM, Trevor Parscal wrote: For those of you who didn't see bug 26791, our use of JSMin has been found to conflict with our GPL license. After assessing other options ( https://bugzilla.wikimedia.org/show_bug.cgi?id=26791#c8 ) Roan and I decided to try and use the minification from JavaScriptPacker, but not its overly clever but generally useless packing techniques. The result is a minifier that outperforms our current minifier in both how quickly it can minify data and how small the minified output is. JavaScriptDistiller, as I sort of randomly named it, minifies JavaScript code at about 2x the speed of Tim's optimized version of JSMin, and 4x the speed of the next fastest PHP port of JSMin (which is generally considered the standard distribution). Similar to Tim's modified version of JSMin, we chose to retain vertical whitespace by default. However we chose not to retain multiple consecutive empty new lines, which are primarily seen where a large comment block has been removed. We feel there is merit to the argument that appx. 1% bloat is a reasonable price to pay for making it easier to read production code, since leaving each statement on a line by itself improves readability and users will be more likely to be able to report problems that are actionable. We do not however find the preservation of line numbers of any value, since in production mode most requests are for many modules which are concatenated, making line numbers for most of the code useless anyways. This is a breakdown based on ext.vector.simpleSearch * 3217 bytes (1300 compressed) * 2178 bytes (944) after running it through the version of JSMin that was in our repository. Tim modified JSMin to be faster and preserve line numbers by leaving behind all vertical whitespace. * 2160 bytes (938 compressed) after running it through JavaScriptDistiller, which applies aggressive horizontal minification plus collapsing multiple consecutive new lines into a single new line. * 2077 bytes (923 compressed) after running it through JavaScriptDistiller with the vertical space option set to true, which applies aggressive horizontal minification as well as some basic vertical minification. This option is activated through $wgResourceLoaderMinifyJSVerticalSpace, which is false by default. The code was committed in r80656. - Trevor (and Roan) ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] JavaScript access to uploaded file contents: SVGEdit gadget needs ApiSVGProxy or CORS
On 01/03/2011 02:22 PM, Brion Vibber wrote: Since ApiSVGProxy serves SVG files directly out on the local domain as their regular content type, it potentially has some of the same safety concerns as img_auth.php and local hosting of upload files. If that's a concern preventing rollout, would alternatives such as wrapping the file data & metadata into a JSON structure be acceptable? hmm... Is img_auth widely used? Can we just disable svg api data access if $wgUploadPath includes img_auth ... or add a configuration variable that states whether img_auth is an active entry point? Why don't we think about the problem differently and support serving images through the api instead of maintaining a separate img_auth entry point? Is the idea that our asset scrubbing for malicious scripts or embedded html tags in images (to protect against IE's lovely 'auto mime' content-type detection) is buggy? I think the majority of MediaWiki installations are serving assets on the same domain as the content, so we would do well to address that security concern as our own (afaik we already address this pretty well). Furthermore we don't want people to have to re-scrub once they do access that svg data on the local domain... It would be nice to serve up different content-type data over the api in a number of use cases. For example we could have a more structured thumb.php entry point, or serve up video thumbnails at requested times and resolutions. This could also clean up Neil's upload wizard per-user temporary image store by requesting these assets through the api instead of relying on obfuscation of the url. Likewise the add media wizard presently does two requests once it opens the larger version of the image. Eventually it would be nice to make more services available, like svg localisation / variable substitution and rasterization (ie give me engine_figure2.svg in Spanish at 600px wide as a png). It may hurt caching to serve everything over jsonp, since we can't set smaxage with callback=randomString urls. If it's just for editing it's not a big deal, until some IE svg viewer hack starts getting all svg over jsonp ;) ... It would be best if we could access this data without varying urls. Alternately, we could look at using HTTP access control headers on upload.wikimedia.org, to allow XMLHTTPRequest in newer browsers to make unauthenticated requests to upload.wikimedia.org and return data directly: https://developer.mozilla.org/En/HTTP_Access_Control I vote yes! ... This would also untaint video canvas data that I am making more and more use of in the sequencer ... Likewise we could add a crossdomain.xml file so IE flash svg viewers can access the data. In the meantime I'll probably work around it with an SVG-to-JSONP proxy on toolserver for the gadget, which should get things working while we sort it out. Sounds reasonable :) We should be able to upload the result via the api on the same domain as the editor, so it would be very fun to enable this for quick svg edits :) peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
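For illustration, this is roughly what the CORS route would let a gadget do, assuming upload.wikimedia.org sent the appropriate Access-Control-Allow-Origin header (which it did not at the time of this thread); the SVG URL is a placeholder.

// Fetch raw SVG source cross-domain, relying on the server sending
// Access-Control-Allow-Origin for the requesting wiki's domain.
function fetchSvgSource( url, callback ) {
    var xhr = new XMLHttpRequest();
    xhr.open( 'GET', url, true );
    xhr.onreadystatechange = function () {
        if ( xhr.readyState === 4 && xhr.status === 200 ) {
            callback( xhr.responseText ); // raw SVG markup for the editor
        }
    };
    xhr.send();
}

fetchSvgSource( 'http://upload.wikimedia.org/wikipedia/commons/x/xx/Example.svg',
    function ( svg ) {
        // hand the markup to svg-edit here
    } );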
Re: [Wikitech-l] JavaScript access to uploaded file contents: SVGEdit gadget needs ApiSVGProxy or CORS
On 01/04/2011 09:57 AM, Roan Kattouw wrote: The separate img_auth.php entry point is needed on wikis where reading is restricted (private wikis), and img_auth.php will check for read permissions before it outputs the file. The difference between the proxy I wrote and img_auth.php is that img_auth.php just streams the file from the filesystem (which, on WMF, will hit NFS every time, which is bad) whereas ApiSVGProxy uses an HTTP request (which will hit the image Squids, which is good). So ... it would be good to think about moving things like img_auth.php and thumb.php over to a general-purpose api media serving module, no? This would help standardise how media serving is extended, reduce extra entry points, and, as you point out above, let us more uniformly proxy our back-end data access over HTTP to hit the squids instead of NFS where possible. And as a shout-out to Trevor's MediaWiki 2.0 vision, it would eventually enable more REST-like interfaces within MediaWiki media handling. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Wikimedia Storage System ( was JavaScript access to uploaded...)
On 01/04/2011 01:12 PM, Neil Kandalgaonkar wrote: We've narrowed it down to two systems that are being tested right now, MogileFS and OpenStack. OpenStack has more built-in stuff to support authentication. MogileFS is used in many systems that have an authentication layer, but it seems you have to build more of it from scratch. Authentication is really a nice-to-have for Commons or Wikipedia right now. I anticipate it being useful for a handful of cases, which are both more anticipated than actual right now: - images uploaded but not published (a la UploadWizard) - forum avatars (which can be viewed by anyone, but can only be edited by the user they belong to) hmm. I think it would (obviously?) be best to handle media authentication at the MediaWiki level, with just a simple private / public accessible classification for the backend storage system. Things that are private have to go through the MediaWiki api, where you can leverage all the existing extendable credential management. It's also important to keep things simple for 3rd parties that are not using a clustered filesystem stack; it's easier to map a web-accessible dir vs not than to manage authentication within the storage system. Image 'editing' / uploading already includes basic authentication, ie: http://www.mediawiki.org/wiki/Manual:Configuring_file_uploads#Upload_permissions User avatars would be a special case of that. I think thumbnail and transformation servers (they should also do stuff like rotating things on demand) are separate from how we store things, and will just be acting on behalf of the user anyway. So they don't introduce new requirements to image storage. Anybody see anything problematic about that? I think managing storage of procedural derivative assets differently than original files is pretty important -- probably one of the core features of a Wikimedia storage system. Assuming finite storage, it would be nice to specify that we don't care as much if we lose thumbnails vs losing original assets. For example, when doing 3rd-party backups or dumps we don't need all the derivatives to be included. We don't need to keep random-resolution derivatives of old revisions of assets around forever; likewise improvements to SVG rasterization or improvements to transcoding software would mean expiring derivatives. When MediaWiki is dealing with file maintenance it should have to authenticate differently when removing, moving, or overwriting originals vs derivatives, i.e. independent of DB revision numbers or what MediaWiki *thinks* it should be doing. For example, only upload ingestion nodes or modes should have write access to the archive store. Transcoding or thumbnailing or maintenance nodes or modes should only have read-only access to archive originals and write access to derivatives. As for things like SVG translation, I'm going to say that's out of scope and probably impractical. Our experience with the Upload Wizard Licensing Tutorial shows that it's pretty rare to be able to simply plug in new strings into an SVG and have an acceptable translation. It usually needs some layout adjustment, and for RTL languages it needs pretty radical changes. That said, it's an interesting frontier and it would be awesome to have a tool which made it easier to create translated SVGs or indicate that translations were related to each other. One thing at a time though. I don't think it's that impractical ;)
It may not be perfectly beautiful, but certainly everyone translating content should not have to know how to edit SVG files; likewise the software can make it easy for a separate svg layout expert to come in later and improve on the automated derivative. But you're correct, it's not really part of storage considerations -- it is part of thinking about the future of access to media streams via the api. Maybe the base thing for the storage platform to consider in this thread is: access to media streams via the api, or whether it is going to try to manage a separate entry point outside of MediaWiki. I think public assets going over the existing squid - http file server path, and non-public assets going through an api entry point, would make sense. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] How would you disrupt Wikipedia?
Looking over the thread, there are lots of good ideas. It's really important to have some plan towards cleaning up abstractions between structured data, procedures in representation, visual representation and tools for participation. But I think it's correct to identify the social aspects of the projects as more critical than purity of abstractions within wikitext. Tools, bots, scripts and clever ui components can abstract away some of the pain of the underlying platform, as long as people are willing to accept a bit of abstraction leakage / lack of coverage in some areas as part of moving to something better. One area that I did not see much mention of in this thread is automated systems for reputation. Reputation systems would be useful both for user interactions and for gauging expertise within particular knowledge domains. Social capital within Wikimedia projects is presently stored in incredibly unstructured ways and has little bearing on user privileges, on how the actions of others are represented to you, or on how your actions are represented to others. It's presently based on the traditional small-scale capacities of individuals to gauge social standing within their social networks and/or to read user pages. We can see automatic reputation systems emerging anytime you want to share anything online, be it making a small loan or trading used DVDs. Sharing information should adopt some similar principles. There has been some good work done in this area with the WikiTrust system (and other user moderation / karma systems). Tying that data into smart interface flows that reward positive social behaviour and productive contributions should make it more fun to participate in the projects and result in more fluid, higher quality information sharing. peace, --michael On 12/29/2010 01:31 AM, Neil Kandalgaonkar wrote: I've been inspired by the discussion David Gerard and Brion Vibber kicked off, and I think they are headed in the right direction. But I just want to ask a separate, but related question. Let's imagine you wanted to start a rival to Wikipedia. Assume that you are motivated by money, and that venture capitalists promise you can be paid gazillions of dollars if you can do one, or many, of the following: 1 - Become a more attractive home to the WP editors. Get them to work on your content. 2 - Take the free content from WP, and use it in this new system. But make it much better, in a way Wikipedia can't match. 3 - Attract even more readers, or perhaps a niche group of super-passionate readers that you can use to build a new community. In other words, if you had no legacy, and just wanted to build something from zero, how would you go about creating an innovation that was disruptive to Wikipedia, in fact something that made Wikipedia look like Friendster or Myspace compared to Facebook? And there's a followup question to this -- but you're all smart people and can guess what it is. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] InlineEditor / Sentence-Level Editing: usability review
On 11/29/2010 07:56 AM, Roan Kattouw wrote: 2010/11/29 Jan Paul Posma jp.po...@gmail.com: Full interview videos will be available on Wikimedia Commons somewhere next month. They are in Dutch, though. Michael, can we subtitle those with mwEmbed magic? Roan Kattouw (Catrope) We can. We can even subtitle them with the slick universal subtitles interface ;) http://techblog.wikimedia.org/2010/10/video-labs-universal-subtitles-on-commons/ Hopefully by then I will have time to add a little language selection dialog at start-up; for now you would have to move the English subtitle into the Dutch subtitle name after you complete your transcription. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Resource Loader problem
The code is not spread across many files... it's a single mw.Parser.js file. It's being used in my gadget and Neil's upload wizard. I agree the parser is not the ideal parser: it's not feature complete, it's not very optimised, and it was hacked together quickly. But it passes all the tests and matches the output of php for all the messages across all the languages. I should have time in the next few days to re-merge / clean it up a bit if no one else is doing it. It should be clear who is doing what. The parser as is ... is more of a starting point than a finished project. But it starts by passing all the tests... If that's useful we can plop it in there. An old version of the test file is here (I have a ported / slightly cleaner version in a patch): http://prototype.wikimedia.org/s-9/extensions/JS2Support/tests/testLang.html It also includes a test file that confirms the transforms work across a sample set of messages. It's not clear to me how the current test files / system scales ... Mostly for Krinkle: mediawiki.util.test.js seems to always include itself when in debug mode. And why does mediawiki.util.test.js not define an object named mediawiki.util.test? It instead defines mediawiki.test. Also: if (wgCanonicalSpecialPageName == 'Blankpage' && mw.util.getParamValue('action') === 'mwutiltest') { Seems gadget-like... this logic can be done on the php side, no? Why not deliver specific test payloads for specific test entry points? If you imagine we have dozens of complicated test systems with sub-components, the debug mode will become overloaded with js code that never runs. --michael On 11/10/2010 10:56 AM, Roan Kattouw wrote: 2010/11/10 Dmitriy Sintsov ques...@rambler.ru: * Trevor Parscal tpars...@wikimedia.org [Wed, 10 Nov 2010 00:16:27 -0800]: Well, we basically just need a template parser. Michael has one that seems to be working for him, but it would need to be cleaned up and integrated, as it's currently spread across multiple files and methods. Do you like writing parsers? Maybe my knowledge of MediaWiki is not good enough, but aren't the local messages only provide the basic syntax features like {{PLURAL:||}}, not a full Parser with template calls and substitutions? I never tried to put real template calls into messages. Rewriting the whole Parser in Javascript would be a lot of work. Many people have already failed to make alternative parsers fully compatible. And how would one call the server-side templates, via AJAX calls? That would be inefficient. We're not looking for a full-blown parser, just one that has a few basic features that we care about. The current JS parser only supports expansion of message parameters ($1, $2, ...), and we want {{PLURAL}} support too. AFAIK that's pretty much all we're gonna need. Michael Dale's implementation has $1 expansion and {{PLURAL}}, AFAIK, and maybe a few other features. I am currently trying to improve my Extension:WikiSync, also I have plans to make my another extensions ResourceLoader compatible. I think {{PLURAL}} is an important feature for ResourceLoader, and if no volunteer wants to implement it, I think a staff developer should. However, if the most of work has already been done, I can take a look, but I don't have the links to look at (branches, patches). I just don't know how much time would it take. Sorry. I believe most of the work has already been done, yes, but I've never seen Michael's code and I don't know where it is (maybe someone who does can post a link?).
Roan Kattouw (Catrope) ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
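As a rough illustration of the (deliberately small) scope being discussed -- parameter expansion plus two-form {{PLURAL}} -- a toy expander might look like the following; this is not the actual mw.Parser.js code, which is driven by the full language test suite.

// Expand $1..$n parameters, then a two-form {{PLURAL:n|one|other}}.
function expandMessage( msg, params ) {
    // parameter substitution first, so PLURAL sees the real number
    msg = msg.replace( /\$(\d+)/g, function ( match, n ) {
        return params[ n - 1 ];
    } );
    return msg.replace( /\{\{PLURAL:([^|}]+)\|([^|}]*)\|([^|}]*)\}\}/g,
        function ( match, count, one, other ) {
            return parseInt( count, 10 ) === 1 ? one : other;
        } );
}
// expandMessage( 'Deleted {{PLURAL:$1|$1 page|$1 pages}}', [ 3 ] )
//   -> 'Deleted 3 pages'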
Re: [Wikitech-l] Commons ZIP file upload for admins
On 10/25/2010 12:02 PM, Erik Moeller wrote: Hello all, for some types of resources, it's desirable to upload source files (whether it's Blender, COLLADA, Scribus, EDL, or some other format), so that others can more easily remix and process them. Currently, as far as I know, there's no way to upload these resources to Commons. What would be the arguments against allowing administrators to upload arbitrary ZIP files on Wikimedia Commons, allowing the Commons community to develop policy and process around when such archived resources are appropriate? An alternative, of course, would be to whitelist every possible source format for admins, but it seems to me that it would be a good general policy to not enable additional support for formats that aren't officially supported (reduces confusion among users about what's permitted -- there's only one file format they can't use). Thoughts? Thanks, Erik It would be most ideal if we actually supported these formats, so we can do things like thumbnails, basic metadata etc. Failing that, it's better to support a given file extension than it is to support zip files. This way, if in 'the future' we add support for format X, then we have X-format files stored consistently, so we can support representation of that file format. If we add blanket support for 'throw whatever you want into a zip file', it will be difficult to give a quality representation of that asset in the future (other than as a zip file with multiple sub-assets). If, for example, someone writes a diff engine for representing 3d model transformations, we won't as easily be able to plug in that tool if we don't have a consistent storage model for that file format. That being said, there may be some composite asset sets that lack container systems, in which case it would not be bad to support some open container format. The number of formats or multimedia asset compositing systems that are not web-representable with JavaScript engines or natively supported in the browser should be on a dramatic decline in the next decade, so it's best to just focus on support for such formats. For example we prefer svg uploads to a zip file with Illustrator assets, because svg is representable in the browser, there are javascript-based engines for editing svg [http://svg-edit.googlecode.com/svn/branches/2.4/editor/svg-editor.html] etc. Likewise for 3d model representation with the COLLADA format (although that is much more in its infancy at this point in time). --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] ResourceLoader Debug Mode
This is getting a little out of hand? People are going to spend more time talking about the potential for minification errors or sub optimisation cost criteria then we will ever actually be running into real minification errors or any real readability issue. Reasonable efforts are being made to make the non-minified versions easily accessible. We can add additional comment to every outputted file that says request me with ?debug=true to get the source code version, add a user preference, add a cookie, and a view non-minfied source link to the bottom every page... But try to keep in mind the whole point of minification is identical program execution while minimising the file size! If we are to optimise with random adjustments for readability we are optimising in two opposite directions, and every enhacment in the direction of optimized pacakge code delivery could potentially go against the 'readability' optimisation. We are already commited to supporting two modes one that optimized readability with raw source code delivery and one that is optimised for small packaged delivery. No sense in setting readability as criteria for the packaged mode since the 'readability' mode will always do 'readability' better. --michael On 10/01/2010 12:38 AM, Trevor Parscal wrote: I was hardly making a case for how amazingly expensive it was. I was running some basic calculations that seemed to support your concept of fairly cheap, but chose to mention that it's still not free. - Trevor On 9/30/10 9:30 PM, Tim Starling wrote: On 01/10/10 04:35, Trevor Parscal wrote: OK, now I've calculated it... On a normal page view with the Vector skin and the Vector extension turned on there's a 2KB difference. On an edit page with the Vector skin and Vector and WikiEditor extensions there's a 4KB difference. While adding 2KB to a request for a person in a remote corner of the world on a 56k modem will only add about 0.3 seconds to the download, sending 2,048 extra bytes to 350 million people each month increases our bandwidth by about 668 gigabytes a month. We don't pay by volume (GB per month), we pay by bandwidth (megabits per second at the 95th percentile). They should be roughly proportional to each other, but to calculate a cost we have to convert that 668GB figure to a percentage of total volume. I took this graph: http://www.nedworks.org/~mark/reqstats/trafficstats-monthly.png And I used the GIMP histogram tool to integrate the outgoing part for 30 days between week 34 and week 37. The result was 31,824 pixels of blue and 20,301 pixels of green, which I figure is about 2113 TB/month. So on your figure, the cost of adding line breaks would be about 0.03% of whatever the bandwidth bill for that month is. I don't have that number to hand, but I suspect 0.03% of it is not going to be very much. For 2009-2010 there was a budget of about $1M for internet hosting, of which bandwidth is a part, and 0.03% of that entire budget category is only $25 per month. I think your 668GB figure is too low, because current uniques is more like 390M per month, and because some unique visitors will request the JS more than once. You can double it if you think it would help you make your case. I don't know what that kind of bandwidth costs the foundation, but it's not free. Developer time is not free either. 
-- Tim Starling ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
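For what it's worth, the figures quoted above check out; a quick back-of-envelope version, using the 2 KB and 350 million numbers from the thread:

var extraBytes = 2048 * 350e6;                  // 2 KB to 350M visitors
var extraGiB = extraBytes / Math.pow( 2, 30 );  // ~667.6, the "668 gigabytes"
var share = extraBytes / ( 2113 * Math.pow( 2, 40 ) );
// share is ~0.0003, i.e. roughly 0.03% of the month's outgoing volume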
Re: [Wikitech-l] ResourceLoader, now in trunk!
On 09/09/2010 10:56 AM, Trevor Parscal wrote: We would need to vary on that cookie, but yes, this seems like a cool idea. - Trevor Previously when we had this conversation, I liked the idea of setting a user preference. http://lists.wikimedia.org/pipermail/wikitech-l/2010-May/047800.html This is easiest to set up with the existing setup, since we already have cache-destroying things in the preferences anyway. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] ResourceLoader, now in trunk!
On 09/07/2010 01:39 AM, Tim Starling wrote: I think it's ironic that this style arises in JavaScript, given that it's a high-level language and relatively easy to understand, and that you could make a technical case in favour of terseness. C has an equally effective minification technique known as compilation, and its practitioners tend to be terse to a fault. For instance, many Linux kernel modules have no comments at all, except for the license header. I would quickly add that it would be best to promote JSDoc-style comments (as I have slowly been doing for mwEmbed). This way we could eventually have the Google Closure Compiler read these comments to better optimise JavaScript compilation: http://code.google.com/closure/compiler/docs/js-for-compiler.html Also we can create pretty JSDoc-style documentation for people who are into that sort of thing. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
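An example of the kind of JSDoc annotation being suggested; the function itself is hypothetical, only the comment style matters here.

/**
 * Fetch the raw wikitext of a page (hypothetical example function).
 *
 * @param {string} title Title of the target page
 * @param {function(string)} callback Called with the page text
 */
function getPageText( title, callback ) {
    // ... api request elided ...
}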
Re: [Wikitech-l] ResourceLoader, now in trunk!
On 09/08/2010 06:28 AM, Roan Kattouw wrote: I don't believe we should necessarily support retrieval of arbitrary wiki pages this way, but that's not needed for Gadgets: there's a gadgets definition list listing all available gadgets, so Gadgets can simply register each gadget as a module (presumably named something like gadget-gadgetname). This is of course already supported, just not in 'grouped requests'. Open up your scripts tab on a fresh load of http://commons.wikimedia.org/wiki/Main_Page Like 24 or so of the 36 script requests on Commons are 'arbitrary wiki pages' requested as javascript: http://commons.wikimedia.org/w/index.php?title=MediaWiki:AjaxQuickDelete.js&action=raw&ctype=text/javascript These are not gadgets in the php-extension sense; ie MediaWiki:Common.js does a lot of loading, and the gadget set is not defined in php on the server. The resource loader should minimally let you group MediaWiki-namespace javascript and css. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
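The two styles of request being contrasted, side by side; importScript() is what MediaWiki:Common.js effectively uses today, while the grouped module name in the second call is purely hypothetical.

// One request per wiki page, served via action=raw:
importScript( 'MediaWiki:AjaxQuickDelete.js' );

// versus one grouped ResourceLoader request covering several wiki-page
// scripts (module name is hypothetical):
mw.loader.load( 'site.commons-tools' );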
Re: [Wikitech-l] ResourceLoader, now in trunk!
On 09/08/2010 11:25 AM, Roan Kattouw wrote: It's defined on a MediaWiki: page, which is accessed by the server to generate the Gadgets tab in Special:Preferences. There is sufficient server-side knowledge about gadgets to implement them as modules, although I guess we might as well save ourselves the trouble and load them as wiki pages, We should have an admin-controlled, globally enabled gadgets system (with support for turning a gadget on per user group, ie all users, admin users, etc.). Each gadget should define something like MediaWiki:Gadget-loader-ImageAnnotator.js, holding the small bit that is presently stored as free text in MediaWiki:Common.js, ie: /** ImageAnnotator * Globally enabled per * http://commons.wikimedia.org/w/index.php?title=Commons:Village_pump&oldid=26818359#New_interface_feature * Maintainer: [[User:Lupo]] */ if (wgNamespaceNumber != -1 && wgAction && (wgAction == 'view' || wgAction == 'purge')) { // Not on Special pages, and only if viewing the page if (typeof (ImageAnnotator_disable) == 'undefined' || !ImageAnnotator_disable) { // Don't even import it if it's disabled. importScript ('MediaWiki:Gadget-ImageAnnotator.js'); } } That should go into the gadget loader file and, of course, instead of importScript, some loader call that aggregates all the loader load calls for a given page-ready time. It should ideally also support some sort of grouping strategy parameter. We should say something like: packages that are larger than 30k or used on a single page should not be grouped, so as to avoid mangled cache effects escalating into problematic scenarios. As we briefly discussed, I agree with Trevor that if the script is small and more or less widely used it's fine to retransmit the same package in different contexts to avoid extra requests on first visit. But it should be noted that separating requests can result in ~fewer~ requests. ie imagine grouping vs separate requests, where page 1 uses resource set A, B and page 2 uses resource set A, C, then page 3 uses A, B, C: you still end up doing 3 requests across the 3 page views, except with the 'one request' strategy you resend A. For a fourth page that just uses B, C you can pull those from cache and do zero requests, or resend B, C if you always go with a 'single request'. Of course as you add more permutations, like page 5 that uses just A, just B or just C, it can get ugly. Which is why we need to ~strongly recommend~ the less-than-30K or rarely-used javascript grouping rules somehow. The old resource loader had the concept of 'buckets'; I believe the present resource loader just has an option to 'not-group', which is fine since 'buckets' could be conceptualized as 'meta module sets' that are 'not-grouped'. Not sure what's conceptually more clear. IMHO buckets are a bit more friendly to modular extension and gadget development, since any module can say it's part of a given group without modifying a master or core manifest. At any rate, we should make sure to promote either the buckets or 'meta module' option, or it could result in painful excessive retransmission of grouped javascript code. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Storing data across requests
On 07/29/2010 10:15 AM, Bryan Tong Minh wrote: Hi, I have been working on getting asynchronous upload from url to work properly[1]. A problem that I encountered was that I need to store data across requests. Normally I would use $_SESSION, but this data should also be available to job runners, and $_SESSION isn't. Could the job not include the session_id and upload_session_key, and then in your job handling code you just connect into that session via session_id( $session_id ); session_start(); to update the values? That seems like it would be more lightweight than DB status updates... I see Platonides suggested this as well. (That is how it was originally done, but with a background php process rather than the jobs table.) See http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/HttpFunctions.php?view=markup&pathrev=53825 line 145 ( doSessionIdDownload ) --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Upload file size limit
On 07/20/2010 10:24 PM, Tim Starling wrote: The problem is just that increasing the limits in our main Squid and Apache pool would create DoS vulnerabilities, including the prospect of accidental DoS. We could offer this service via another domain name, with a specially-configured webserver, and a higher level of access control compared to ordinary upload to avoid DoS, but there is no support for that in MediaWiki. We could theoretically allow uploads of several gigabytes this way, which is about as large as we want files to be anyway. People with flaky internet connections would hit the problem of the lack of resuming, but it would work for some. Yes, in theory we could do that ... or we could support some simple chunked uploading protocol, for which there is *already* basic support written, and which will be supported in native js over time. The firefogg protocol is almost identical to the plupload protocol. The main difference is that firefogg requests a unique upload parameter / url back from the server, so that if you uploaded identically named files they would not mangle the chunking. From a quick look at plupload's upload.php, it appears plupload relies on the filename and an extra chunk != 0 url request parameter. The other difference is that firefogg has an explicit done = 1 request parameter to signify the end of the chunks. We requested feedback about adding a chunk id to the firefogg chunk protocol with each posted chunk, to guard against cases where the outer caches report an error but the backend got the file anyway. This way the backend can check the chunk index and not append the same chunk twice, even if there are errors at other levels of the server response that cause the client to resend the same chunk. Either way, if Tim says the plupload chunk protocol is superior, then why discuss it? We can easily shift the chunks api to that and *move forward* with supporting larger file uploads. Is that at all agreeable? peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
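A sketch of the chunk-index idea using the modern File API; parameter names and the endpoint are illustrative and deliberately do not match either the firefogg or the plupload protocol exactly.

// Upload a file in fixed-size chunks, carrying an explicit chunk index so
// the server can ignore a chunk it has already appended, even when a cache
// layer reported an error back to the client.
function uploadChunks( file, chunkSize, uploadUrl, done ) {
    var index = 0;
    function sendNext() {
        var start = index * chunkSize;
        if ( start >= file.size ) {
            return done();
        }
        var form = new FormData();
        form.append( 'chunkIndex', index ); // lets the server drop repeats
        form.append( 'done', start + chunkSize >= file.size ? 1 : 0 );
        form.append( 'chunk', file.slice( start, start + chunkSize ) );

        var xhr = new XMLHttpRequest();
        xhr.open( 'POST', uploadUrl, true );
        xhr.onload = function () {
            index++; // resending the same index is safe: the server ignores it
            sendNext();
        };
        xhr.send( form );
    }
    sendNext();
}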
Re: [Wikitech-l] [gsoc] splitting the img_metadata field into a new table
More important than file_metadata and page asset metadata working with the same db table backed, its important that you can query export all the properties in the same way. Within SMW you already have some special properties like pagelinks, langlinks, category properties etc, that are not stored the same as the other SMW page properties ... The SMW system should name-space all these file_metadata properties along with all the other structured data available and enable universal querying / RDF exporting all the structured wiki data. This way file_metadata would just be one more special data type with its own independent tables. ... SMW should abstract the data store so it works with the existing structured tables. I know this was already done for categories correct? Was enabling this for all the other links and usage tables explored? This also make sense from an architecture perspective, where file_metadata is tied to the file asset and SMW properties are tied to the asset wiki description page. This way you know you don't have to think about that subset of metadata properties on page updates since they are tied to the file asset not the wiki page propriety driven from structured user input. Likewise uploading a new version of the file would not touch the page data tables. --michael Markus Krötzsch wrote: Hi Bawolff, interesting project! I am currently preparing a light version of SMW that does something very similar, but using wiki-defined properties for adding metadata to normal pages (in essence, SMW is an extension to store and retrieve page metadata for properties defined in the wiki -- like XMP for MW pages; though our data model is not quite as sophisticated ;-). The use cases for this light version are just what you describe: simple retrieval (select) and basic inverse searches. The idea is to thus have a solid foundation for editing and viewing data, so that more complex functions like category intersections or arbitrary metadata conjunctive queries would be done on external servers based on some data dump. It would be great if the table you design could be used for such metadata as well. As you say, XMP already requires extensibility by design, so it might not be too much work to achieve this. SMW properties are usually identified by pages in the wiki (like categories), so page titles can be used to refer to them. This just requires that the meta_name field is long enough to hold MW page title names. Your meta_schema could be used to separate wiki properties from other XMP properties. SMW Light does not require nested structures, but they could be interesting for possible extensions (the full SMW does support one-level of nesting for making compound values). Two things about your design I did not completely understand (maybe just because I don't know much about XMP): (1) You use mediumblob for values. This excludes range searches for numerical image properties (Show all images of height 1000px or more) which do not seem to be overly costly if a suitable schema were used. If XMP has a typing scheme for property values anyway, then I guess one could find the numbers and simply put them in a table where the value field is a number. Is this use case out of scope for you, or do you think the cost of reading from two tables too high? One could also have an optional helper field meta_numvalue used for sorting/range-SELECT when it is known from the input that the values that are searched for are numbers. 
(2) Each row in your table specifies property (name and schema), type, and the additional meta_qualifies. Does this mean that one XMP property can have values of many different types and with different flags for meta_qualifies? Otherwise it seems like a lot of redundant data. Also, one could put stuff like type and qualifies into the mediumblob value field if they are closely tied together (I guess, when searching for some value, you implicitly specify what type the data you search for has, so it is not problematic to search for the value + type data at once). Maybe such considerations could simplify the table layout, and also make it less specific to XMP. But overall, I am quite excited to see this project progressing. Maybe we could have some more alignment between the projects later on (How about combining image metadata and custom wiki metadata about image pages in queries? :-) but for GSoC you should definitely focus on your core goals and solve this task as good as possible. Best regards, Markus On Freitag, 28. Mai 2010, bawolff wrote: Hi all, For those who don't know me, I'm one of the GSOC students this year. My mentor is ^demon, and my project is to enhance support for metadata in uploaded files. Similar to the recent thread on interwiki transclusions, I'd thought I'd ask for comments about what I propose to do. Currently metadata is stored in
Re: [Wikitech-l] Revisiting becoming an OpenID Provider
Robb Shecter wrote: Consider this true scenario: I want to write a MediaWiki API client for editors; something like the Wordpress Dashboard. Really give editors a modern web experience. I'd want to do this as a Rails app: I could build it quickly and find lots of collaborators via GitHub. Not to derail the OpenID idea: I think we should support OAuth 100%, and it certainly would help with persistent applications and scalability... But ... for the most part you can build these types of applications in pure javascript. Anytime you need to run an api action that requires you to be on the target domain, you call a bit of code to iframe-proxy that action on the target domain and communicate its results to the client domain with another iframe back to the client. mwEmbed provides an iFrame proxy as part of a uniform api request system via the mw.getJSON() function. This lets you just call that function, and mwEmbed works out whether it needs to spawn a proxy or can make the request directly. Presently I hard-code the approved domains, but it would not be difficult to add a process where users could approve domains / applications. We could even do explicit approval for the set of allowable api actions being requested. ( ie edit pages OK, upload NO ) This has been in use for a while, and it's how uploading to commons from an English encyclopedia page works with the add-media-wizard gadget. http://bit.ly/9P144i You can test it simply by enabling that gadget, then while editing click insert image, then the upload button, then upload to commons. ~Right now~ it's a pure javascript gadget that is enabled on (en.wikipedia) which calls another gadget on ( commons.wikimedia ), and they set up two-way communication that way. To make things more complicated, all the javascript and html proxy pages are hosted on a 3rd domain ( prototype.wikimedia.org ), and it's not just simple api calls; rather it's a full file-uploading proxy with progress indicators and two-way error interactions. In the context of the mwEmbed gadget this is more complicated than it needs to be. I should package an apiProxy extension that could simplify things, like having an actual proxy entry point that does not load the entire set of mediaWiki view page assets on every proxy interaction. Also it could use some HTML5-type enhancements around cross-domain communication so the application could send and receive the messages directly where the domain is approved and the browser supports it. Furthermore some versions of IE have to request user approval for the iFrame to carry user credentials, but this can be avoided with a p3p policy added to the response header. http://bit.ly/13kpV That being said it has worked okay for what I needed it for, and I think it could be used for prototyping the editors portal as you have described it. peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
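For readers unfamiliar with the pattern, here is a sketch of what the calling convention looks like from a gadget's point of view. The exact mw.getJSON() signature in mwEmbed may differ; the endpoint and query parameters below are just an illustrative example, not taken from the add-media-wizard code.

// Sketch of the uniform request pattern described above. The caller does not
// care whether the request goes direct (same domain) or through a hidden
// iframe proxy hosted on the approved target domain.
mw.getJSON(
	'http://commons.wikimedia.org/w/api.php',
	{ action: 'query', prop: 'imageinfo', titles: 'File:Example.ogg', format: 'json' },
	function ( data ) {
		// same callback either way; mwEmbed handled the cross-domain part
		console.log( data );
	}
);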
Re: [Wikitech-l] js2 extensions / Update ( add-media-wizard, uploadWizard, timed media player )
Aryeh Gregor wrote: On Thu, May 20, 2010 at 3:20 PM, Michael Dale md...@wikimedia.org wrote: I like the idea of a user preference. This way you don't constantly have to add debug to the url, it's easy to test across pages, and it is more difficult for many people to accidentally invoke it. It also means that JavaScript developers will be consistently getting served different code from normal users, and therefore will see different sets of bugs. I'm unenthusiastic about that prospect. hmm ... if the minification resulted in different bugs than the raw code, that would be a bug with the minification process, and you would want to fix that minification bug. You will want to know where the error occurred in the minified code. Preserving newlines is a marginal error-accessibility gain when you're grouping many scripts, replacing all the comments with newlines, stripping debug lines, and potentially shortening local-scope variable names. Once you are going to fix an issue you will be fixing it in the actual code, not the minified output, so you will need to recreate the bug with the non-minified output. In terms of all-things-being-equal compression-wise, using \n instead of semicolons, consider: a = b + c; (d + e).print(); With \n instead of ; it will be evaluated as: a = b + c(d + e).print(); We make wide use of parenthetical modularization, ie all the jquery plugins do something like: (function($){ /* plugin code in local function scope using $ for jQuery */ })(jQuery); Initialization code on the line above such a block, with /\n/ substituted for ';', will result in errors. The use of a script-debug preference is for user-script development that is hosted live and developed on the server wiki pages. Development of core and extension javascript components should be tested locally in both minified and non-minified modes. peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
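To spell out the hazard with the parenthetical-modularization pattern mentioned above, here is a small self-contained illustration (not from the original thread):

// Safe only because of the explicit semicolon after the first statement:
var a = b + c;
(function ( $ ) {
	// plugin code in local function scope using $ for jQuery
}( jQuery ));

// If a minifier replaced that ';' with a newline, the parser would read it as
//   var a = b + c(function ( $ ) { ... }( jQuery ));
// ie it would try to call c() -- a runtime error, not equivalent code.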
Re: [Wikitech-l] js2 extensions / Update ( add-media-wizard, uploadWizard, timed media player )
Helder Geovane wrote: I would support a url flag to avoid minification and/or avoid script-grouping, as suggested by Michael Dale, or even to have a user preference to enable/disable minification in a more permanent way (so we don't need to change the url on each test: we just disable minification, debug the code and then enable it again). I like the idea of a user preference. This way you don't constantly have to add debug to the url, it's easy to test across pages, and it is more difficult for many people to accidentally invoke it. Committed support for the preference in r66703 --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] js2 extensions / Update ( add-media-wizard, uploadWizard, timed media player )
The script-loader has a few modes of operation. You can run it in raw file mode ( ie $wgEnableScriptLoader = false ). This will load all your javascript files directly, only doing php requests to get the messages. In this mode php does not touch any js or css file you're developing. Once ready for production you can enable the script-loader: it groups, localizes, removes debug statements, transforms css url paths, and minifies the set of javascript / css. It includes experimental support for the google closure compiler, which does much more aggressive transformations. I think you're misunderstanding the point of the script-loader. Existing extensions used on wikipedia already package and minify javascript code statically. If you want to have remote users communicate javascript debugging info, we could add a url flag to avoid minification and/or script-grouping. That may be useful in the case of user-scripts / gadgets. But in general it's probably better / easier for end users to just identify their platform and what's not working, since it's all code to them anyway. If they are a developer or are going to do something productive with what they are seeing, they likely have the code checked out locally and use the debug mode. --michael Aryeh Gregor wrote: On Mon, May 17, 2010 at 6:43 PM, Maciej Jaros e...@wp.pl wrote: So does this extension encrypt JS files into being non-debugable? I could understand that on sites like Facebook but on an open or even Open site like Wikipedia/Mediawiki? This just seems to be wrong. Simple concatenation of files would serve the same purpose in terms of requests to the server. At the very least, newlines should be preserved, so you can get a line number when an error occurs. Stripping other whitespace and comments is probably actually worth the performance gain, from what I've heard, annoying though it may occasionally be. Stripping newlines is surely not worth the added debugging pain, on the other hand. (Couldn't you even make up for it by stripping semicolons?) ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] js2 extensions / Update ( add-media-wizard, uploadWizard, timed media player )
If you have been following the svn commits you may have noticed a bit of activity on the js2 front. I wanted to send a quick heads up that describes what is going on, invite people to try things out, and ask for feedback. == Demos == The js2 extension and associated extensions are running on sandbox-9. If you view the source of a main page you can see all the scripts and css grouped into associated buckets: http://prototype.wikimedia.org/sandbox.9/Main_Page I did a (quick) port of usabilityInitiative to use the script-loader as well. Notice if you click edit on a section you get all the css and javascript, localized in your language and delivered in a single request. ( I don't include the save / publish button since it was just a quick port ) Part of the js2 work included a wiki-text parser for javascript client-side message transformation: http://prototype.wikimedia.org/s-9/extensions/JS2Support/tests/testLang.html There are a few cases out of the 356 tests where I think character encoding is not letting identical messages pass the test, and a few transformations that don't match up. I will take a look at those edge cases soon. The Multimedia initiative ( Neil and Guillaume's ) UploadWizard is a js2 / mwEmbed based extension and is also enabled on that wiki: http://prototype.wikimedia.org/sandbox.9/Special:UploadWizard The js2 branch of the OggHandler includes transcode support ( so we embed web-resolution oggs when embedding at web resolution in pages ). This avoids 720P ogg videos being displayed at 240 pixels wide inline ;) http://prototype.wikimedia.org/sandbox.9/Transcode_Test The TimedMediaHandler of course includes timed text display support, which has been seen on commons for a while http://bit.ly/aLo1pZ ... Subtitles get looked up from commons when the repo is shared: http://prototype.wikimedia.org/sandbox.9/File:Welcome_to_globallives_2.0.ogv I have been working with the miro universal subtitles effort, so we should have an easy interface for people to contribute subtitles soon. Edit pages of course include the add-media-wizard, which has been available remotely http://bit.ly/9P144i for some time and now also works as an extension. == Documentation == Some initial JS2 extension documentation is in extensions/JS2Support/README. Feedback on that documentation would also be helpful. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] JS2 code live? (was: Uploads on small wikis)
withJS is a special parameter in MediaWiki:Common.js that lets people preview mediaWiki-namespace user scripts. It's been on commons for ages and on en.wikipedia for a few weeks. peace, --michael Chad wrote: On Fri, Mar 12, 2010 at 1:12 PM, Michael Dale md...@wikimedia.org wrote: Guillaume Paumier wrote: Just FYI, we're working on both (crosswiki-upload and 1-click crosswiki file move), but we're not quite there yet. As mentioned on the commons list, a cross-site upload tool is in early / alpha / experimental testing: http://lists.wikimedia.org/pipermail/commons-l/2010-March/005335.html To summarize from that post you can visit: http://en.wikipedia.org/w/index.php?title=Wikipedia:Sandbox&action=edit&withJS=MediaWiki:MwEmbed.js I haven't seen this new withJS parameter anywhere in trunk or wmf-deployment, only in js2-work, unless I'm being really dense today. When did this go live? -Chad ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] New committers
Should we define some sort of convention for extensions to develop selenium tests? ie perhaps some of the tests should live in a testing folder of each extension, or in /trunk/testing/extensionName, so that it's easier to modularize the testing suite? --michael Roan Kattouw wrote: bhagya - QA engineer for Calcey Technologies, will be committing Selenium tests to /trunk/testing/selenium janesh - same Roan Kattouw (Catrope) ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] New Subversion committers
Also just added Michael Shynar ( shmichael ) from Kaltura who is doing some add-media-wizard work. --michael Tim Starling wrote: Bawolff: various Wikinews-related extensions Jonathan Williford: extensions developed for http://neurov.is/on Ning Hu: Semantic NotifyMe Rob Lanphier and Conrad Irwin have been added to the core committer group. -- Tim Starling ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] mwEmbed gadget updates and improved subtitle support
Wanted to do a quick mention of the updated mwEmbed gadget and subtitle support on the email list(s) in case people have missed its mention on the village pump [1] or other venues. * On commons the Commons:Timed_Text and Template:Closed_cap have been started. * oggHandler has been patched to support itext [2], [3] output so that it does not need to hit the api to get the list of available subtitles when we are embedding locally. Remote embedding outside of the wikimedia domain grabs the up-to-date list of available tracks via an api call. * A basic "Add timed text" interface is accessible from the cc button, letting you upload an srt file. * If you enable the mwEmbed gadget on English wikipedia ( or other wikipedias ) it will display commons subtitles in the respective wgUserLanguage. ( Interface translations are temporarily disabled per bug 21947, which should be resolvable as soon as the release is branched. ) * Even with extremely limited exposure the number of subtitle files is starting to grow: [4] * The timedText display supports inline wiki-text, so people's names, subjects and places of interest can be linked to their respective wikipedia articles in their respective languages from the srt text. [5] * Gadget feedback has been very good. Big thanks to all that have helped test, and especially User:84user with his very detailed reports ;) And finally .. I imagine it ~may~ take some time before the mwEmbed stuff makes its way through code review and on-by-default deployment, because of release branching and because mwEmbed includes quite a few other components ... But similar to the usability beta and other gadgets, we could let people do a much easier opt-in, which I will continue to look into in the meantime. peace, michael [1] http://commons.wikimedia.org/wiki/Commons:Village_pump#Improved_Close_Captions_Support [2] http://www.annodex.net/~silvia/itext/ [3] http://www.mediawiki.org/wiki/Special:Code/MediaWiki/60458 [4] http://commons.wikimedia.org/w/index.php?title=Special%3AAllPages&from=&to=&namespace=102 [5] http://commons.wikimedia.org/wiki/File:Yochai_Benkler_-_On_Autonomy,_Control_and_Cultureal_Experience.ogg ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] JS2 design (was Re: Working towards branching MediaWiki 1.16)
... that makes sense .. ( on the side I was looking into a fall-back ogg video serving solution that would hit the disk issue ) .. but in this context you're right .. it's about saturating the network port. Since network ports are generally pretty fast, a test on my laptop may be helpful: (running PHP 5.2.6-3ubuntu4.2, Apache/2.2.11, Intel Centrino 2Ghz ) Let's take a big script-loader request running from memory, say the firefogg advanced encoder javascript set (from the trunk... I made the small modifications Tim suggested, ie don't parse the javascript file to get the class list):
#ab -n 1000 -c 100 'http://localhost/wiki_trunk/js2/mwEmbed/jsScriptLoader.php?urid=18&class=mv_embed,window.jQuery,mvBaseUploadInterface,mvFirefogg,mvAdvFirefogg,$j.ui,$j.ui.progressbar,$j.ui.dialog,$j.cookie,$j.ui.accordion,$j.ui.slider,$j.ui.datepicker'
The result is:
Concurrency Level: 100
Time taken for tests: 1.134 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 64019000 bytes
HTML transferred: 63787000 bytes
Requests per second: 881.54 [#/sec] (mean)
Time per request: 113.437 [ms] (mean)
Time per request: 1.134 [ms] (mean, across all concurrent requests)
Transfer rate: 55112.78 [Kbytes/sec] received
So we are hitting near 900 requests per second on my 2-year-old laptop. Now if we take the static minified combined file, which is 239906 bytes instead of 64019 bytes, we should of course get a much higher RPS going direct to apache:
#ab -n 1000 -c 100 http://localhost/static_combined.js
Concurrency Level: 100
Time taken for tests: 0.604 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 240385812 bytes
HTML transferred: 240073188 bytes
Requests per second: 1655.18 [#/sec] (mean)
Time per request: 60.416 [ms] (mean)
Time per request: 0.604 [ms] (mean, across all concurrent requests)
Transfer rate: 388556.37 [Kbytes/sec] received
Here we get near 400MB/s and around 2x the requests per second... At a cost of roughly half the requests per second, you can send clients content that is about 3 times smaller (ie faster). Of course none of this applies to the wikimedia setup, where these would all be squid proxy hits. I hope this shows that we don't necessarily have to point clients to static files, and that php pre-processing of the cache is not quite as costly as Tim outlined (if we set up an entry point that first checks the disk cache before loading in all of the mediaWiki php ). Additionally, most mediaWiki installs out there are probably not serving up thousands of requests per second (and those that are probably have proxies set up).. So the gzipping php proxy of js requests is worthwhile. --michael Aryeh Gregor wrote: On Wed, Sep 30, 2009 at 3:32 PM, Michael Dale md...@wikimedia.org wrote: Has anyone done any scalability studies into a minimal php @readfile script vs apache serving the file? Obviously apache will serve the file a lot faster, but a question I have is at what file size does it saturate disk reads as opposed to saturating CPU? It will never be disk-bound unless the site is tiny and/or has too little RAM. The files can be expected to remain in the page cache perpetually as long as there's a constant stream of requests coming in. If the site is tiny, performance isn't a big issue (at least not for the site operators). If the server has so little free RAM that a file that's being read every few minutes and is under a megabyte in size is consistently evicted from the cache, then you have bigger problems to worry about.
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] JS2 design (was Re: Working towards branching MediaWiki 1.16)
Aryeh Gregor wrote: Also remember the possibility that sysops will want to include these scripts (conditionally or unconditionally) from MediaWiki:Common.js or such. Look at the top of http://en.wikipedia.org/wiki/MediaWiki:Common.js, which imports specific scripts only on edit/preview/upload; only on watchlist view; only for sysops; only for IE6; and possibly others. It also imports Wikiminiatlas unconditionally, it seems. I don't see offhand how sysop-created server-side conditional includes could be handled, but it's worth considering at least unconditional includes, since sysops might want to split code across multiple pages for ease of editing. This highlights the complexity of managing all javascript dependencies on the server side... If possible the script-loader should dynamically handle these requests. For wikimedia it's behind a squid proxy so it should not be too bad. For small wikis we could set up a dedicated entry point that could first check the file cache key before loading all of webstart.php, parsing javascript classes, and all the other costly mediaWiki web engine stuff. Has anyone done any scalability studies into a minimal php @readfile script vs apache serving the file? Obviously apache will serve the file a lot faster, but a question I have is at what file size does it saturate disk reads as opposed to saturating CPU? --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] js2 coding style for html output
My attachment did not make it into the JS2 design thread... and that thread is in summary mode, so here is a new post around the html output question. Which of the following constructions is easier to read and understand? Is there some tab delimitation format we should use to make the jquery builder format easier? Are performance considerations relevant? (email is probably a bad context for comparison since tabs will get messy and there is no syntax highlighting) Tim suggested that in a security review context dojBuild-type html output is more straightforward to review. I think both are useful, and I like jquery-style building of html since it gives you direct syntax errors rather than html parse errors, which are not as predictable across browsers. But sometimes, performance-wise or from a quick get-it-working perspective, it's easier to write out an html string. Also I think tabbed html is a bit easier on the eyes for someone that has dealt a lot with html. Something that's not fun about the jquery style is that there are many ways to build that same html string using .wrap or any of the other dozen jquery html manipulation functions ... so the same html could be structured very differently in the code. Furthermore a jquery chain can get pretty long or be made up of lots of other vars, potentially making it tricky to rearrange things or identify what html is coming from where. But perhaps that could be addressed by having jquery html construction conventions (or a wrapper that mirrored our php-side html construction conventions? ) In general I have used the html output style but did not really think about it a priori, and I am open to transitioning to more jquery-style output. Here is the html; you can copy and paste this in... On my system (Firefox nightly) the str builder hovers around 20ms while the jquery builder hovers around 150ms (hard to say what would be a good target number of dom actions or what is a fair test...). jQuery could for example output to a variable instead of direct-to-dom output, shaving 10ms or so, and many other tweaks are possible.

<html>
<head>
<title>Jquery vs str builder</title>
<script type="text/javascript" src="http://jqueryjs.googlecode.com/files/jquery-1.3.2.min.js"></script>
<script type="text/javascript">
var repetCount = 200;
function runTest( mode ){
	$('#cat').html('');
	var t0 = new Date().getTime();
	if( mode == 'str' ){
		doStrBuild();
	} else {
		dojBuild();
	}
	$('#rtime').html( (new Date().getTime() - t0) + 'ms' );
}
function doStrBuild(){
	var o = '';
	for( var i = 0; i < repetCount; i++ ){
		o += '<span id="' + escape(i) + '" class="fish">' +
			'<p class="dog" rel="foo">' + escape(i) + '</p>' +
			'</span>';
	}
	$('#cat').append( o );
}
function dojBuild(){
	for( var i = 0; i < repetCount; i++ ){
		$('<span/>')
			.attr({ 'id': i, 'class': 'fish' })
			.append(
				$('<p/>')
					.attr({ 'class': 'dog', 'rel': 'foo' })
					.text( i )
			).appendTo('#cat');
	}
}
</script>
</head>
<body>
<h3>Jquery vs dom insert</h3>
Run Time: <span id="rtime"></span><br>
<a onClick="runTest('str');" href="#">Run Str</a><br>
<a onClick="runTest('dom');" href="#">Run Jquery</a><br>
<br>
<div id="cat"></div>
</body>
</html>

--michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] js2 coding style for html output
[snip] What I think we have here is that $('#cat') is expensive and is run inside a loop in dojBuild. You can build once and append once in the jquery version and it only shaves 10ms, ie the following still incurs the jquery html-building function call costs:

function dojBuild(){
	var o = '';
	for( var i = 0; i < repetCount; i++ ){
		// wrap in a detached div so .html() captures the span's own markup
		o += $('<div/>').append(
			$('<span/>')
				.attr({ 'id': i, 'class': 'fish' })
				.append(
					$('<p/>')
						.attr({ 'class': 'dog', 'rel': 'foo' })
						.text( i )
				)
		).html();
	}
	$('#cat').append( o );
}

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
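A third variant (not from the original thread, just a sketch of the same point): keep the jQuery node building but avoid both the repeated $('#cat') lookup and the string round-trip, by appending into a detached container and inserting it once.

// Sketch: build into a detached container, touch the live DOM only once.
function dojBuildDetached(){
	var $out = $('<div/>');   // detached; no live DOM work inside the loop
	for( var i = 0; i < repetCount; i++ ){
		$('<span/>')
			.attr({ 'id': i, 'class': 'fish' })
			.append(
				$('<p/>').attr({ 'class': 'dog', 'rel': 'foo' }).text( i )
			)
			.appendTo( $out );
	}
	$('#cat').append( $out.children() );
}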
Re: [Wikitech-l] JS2 design (was Re: Working towards branching MediaWiki 1.16)
Tim Starling wrote: Michael Dale wrote: That is part of the idea of centrally hosting reusable client-side components so we control the jquery version and plugin set. So a new version won't come along until its been tested and integrated. You can't host every client-side component in the world in a subdirectory of the MediaWiki core. Not everyone has commit access to it. Nobody can hope to properly test every MediaWiki extension. Most extension developers write an extension for a particular site, and distribute their code as-is for the benefit of other users. They have no interest in integration with the core. If they find some jQuery plugin on the web that defines an interface that conflicts with MediaWiki, say jQuery.load() but with different parameters, they're not going to be impressed when you tell them that to make it work with MediaWiki, they need to rewrite the plugin and get it tested and integrated. Different modules should have separate namespaces. This is a key property of large, maintainable systems of code. Right.. I agree the client side code needs more deployable modularly. If designing a given component as a jquery plug-in, then I think it makes sense to put it in the jQuery namespace ... otherwise you won't be able to reference jquery things in a predictable way. Alternativly you I agree that the present system of parsing top of the javascipt file on every script-loader generation request is un-optimized. (the idea is those script-loader generations calls happen rarely but even still it should be cached at any number of levels. (ie checking the filemodifcation timestamp, witting out a php or serialized file .. or storing it in any of the other cache levels we have available, memcahce, database, etc ) Actually it parses the whole of the JavaScript file, not the top, and it does it on every request that invokes WebStart.php, not just on mwScriptLoader.php requests. I'm talking about jsAutoloadLocalClasses.php if that's not clear. Ah right... previously I had it in php. I wanted to avoid listing it twice but obviously thats a pretty costly way to do that. This will make more sense to put in php if we start splitting up components into the extension folders and generate the path list dynamically for a given feature set. Have you looked at the profiling? On the Wikimedia app servers, even the simplest MW request takes 23ms, and gen=js takes 46ms. A static file like wikibits.js takes around 0.5ms. And that's with APC. You say MW on small sites is OK, I think it's slow and resource-intensive. That's not to say I'm sold on the idea of a static file cache, it brings its own problems, which I listed. yea... but almost all script-loader request will be cached. it does not need to check the DB or anything its just a key-file lookup (since script-loader request pass a request key either its there in cache or its not ...it should be on par with the simplest MW request. Which is substantially shorter then around trip time for getting each script individually, not to mention gziping which can't otherwise be easily enabled for 3rd party installations. I don't think that that comparison can be made so lightly. For the server operator, CPU time is much more expensive than time spent waiting for the network. And I'm not proposing that the client fetches each script individually, I'm proposing that scripts be concatentated and stored in a cache file which is then referenced directly in the HTML. I understand. 
We could even check gzipping support at page output time and point to the gzipped cached versions (analogous to making direct links to the /script-cache folder of the present script-loader setup). My main question is: how will this work for dynamic groups of scripts, set post page load, that are dictated by user interaction or client state? It's not as easy to set up static combined output files to point to when you don't know what set of scripts you will be requesting ahead of time. $wgSquidMaxage is set to 31 days (2678400 seconds) for all wikis except wikimediafoundation.org. It's necessary to have a very long expiry time in order to fill the caches and achieve a high hit rate, because Wikimedia's access pattern is very broad, with the long tail dominating the request rate. okay... so to preserve a high cache hit level you could then have a single static file that lists the versions of the js with a low expiry, and the rest with a high expiry? Or maybe it's so cheap to serve static files that it does not matter, and we just leave everything with a low expiry? --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
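For readers following the "dynamic groups" question above, here is roughly what such a post-page-load request looks like from the client side. The entry point name and class parameter mirror the mwScriptLoader style discussed in this thread, but the exact names and the onload handling are simplified assumptions, not a spec.

// Sketch: request a grouped script set only when the user actually needs it.
function loadClassSet( classList, callback ) {
	var s = document.createElement( 'script' );
	s.src = wgScriptPath + '/mwScriptLoader.php?class=' + classList.join( ',' );
	s.onload = callback;   // simplified; older IE would need onreadystatechange
	document.getElementsByTagName( 'head' )[0].appendChild( s );
}

// Only fetched once the user opens the upload dialog:
loadClassSet( [ '$j.ui', '$j.ui.dialog', 'mvBaseUploadInterface' ], function () {
	// dependencies are now present; show the dialog
} );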
Re: [Wikitech-l] JS2 design (Read this Not Previous)
~ d'oh ~ Disregard previous; a bad keystroke sent it rather than saving to draft. Tim Starling wrote: Michael Dale wrote: That is part of the idea of centrally hosting reusable client-side components so we control the jquery version and plugin set. So a new version won't come along until it's been tested and integrated. You can't host every client-side component in the world in a subdirectory of the MediaWiki core. Not everyone has commit access to it. Nobody can hope to properly test every MediaWiki extension. Most extension developers write an extension for a particular site, and distribute their code as-is for the benefit of other users. They have no interest in integration with the core. If they find some jQuery plugin on the web that defines an interface that conflicts with MediaWiki, say jQuery.load() but with different parameters, they're not going to be impressed when you tell them that to make it work with MediaWiki, they need to rewrite the plugin and get it tested and integrated. Different modules should have separate namespaces. This is a key property of large, maintainable systems of code. Right.. I agree the client-side code needs to be more modularly deployable. It's just tricky to manage all those relationships in php, but it appears it will be necessary to do so... If designing a given component as a jQuery plug-in, then I think it makes sense to put it in the jQuery namespace ... otherwise you won't be able to reference jQuery things locally in a no-conflict compatible way. Unless we create a mw wrapper of sorts, but I don't know how necessary that is atm... I guess it would be slightly cleaner. I agree that the present system of parsing the top of the javascript file on every script-loader generation request is un-optimized. (The idea is that those script-loader generation calls happen rarely, but even still it should be cached at any number of levels, ie checking the file modification timestamp, writing out a php or serialized file .. or storing it in any of the other cache levels we have available: memcache, database, etc.) Actually it parses the whole of the JavaScript file, not the top, and it does it on every request that invokes WebStart.php, not just on mwScriptLoader.php requests. I'm talking about jsAutoloadLocalClasses.php if that's not clear. Ah right... previously I had it in php. I wanted to avoid listing it twice, but obviously that's a pretty costly way to do that. This will make more sense to put in php if we start splitting up components into the extension folders and generate the path list dynamically for a given feature set. Have you looked at the profiling? On the Wikimedia app servers, even the simplest MW request takes 23ms, and gen=js takes 46ms. A static file like wikibits.js takes around 0.5ms. And that's with APC. You say MW on small sites is OK, I think it's slow and resource-intensive. That's not to say I'm sold on the idea of a static file cache, it brings its own problems, which I listed. yea... but almost all script-loader requests will be cached. It does not need to check the DB or anything; it's just a key-file lookup (since script-loader requests pass a request key, either it's there in cache or it's not)... it should be on par with the simplest MW request. Which is substantially shorter than the round trip time for getting each script individually, not to mention gzipping, which can't otherwise be easily enabled for 3rd party installations. I don't think that that comparison can be made so lightly.
For the server operator, CPU time is much more expensive than time spent waiting for the network. And I'm not proposing that the client fetches each script individually, I'm proposing that scripts be concatenated and stored in a cache file which is then referenced directly in the HTML. I understand. (It's analogous to making direct links to the /script-cache folder instead of requesting the files through the script-loader entry point.) My main question is: how will this work for dynamic groups of scripts, set post page load, that are dictated by user interaction or client state? Do we just ignore this possibility and grab any necessary module components based on pre-defined module sets in php that get passed down to javascript? It's not as easy to set up static combined output files to point to when you don't know what set of scripts you will be requesting... hmm... if we had a predictable key format we could do a request for the static file; if we get a 404 then we do a dynamic request to generate the static file?.. Subsequent interactions would hit that static file? That seems ugly though. $wgSquidMaxage is set to 31 days (2678400 seconds) for all wikis except wikimediafoundation.org. It's necessary to have a very long expiry time in order to fill the caches and achieve a high hit rate, because Wikimedia's access pattern is very broad, with the long tail
Re: [Wikitech-l] JS2 design (was Re: Working towards branching MediaWiki 1.16)
We have js2AddOnloadHook, which gives you jquery in no-conflict mode as the $j variable. The idea behind using a different name is to separate jquery-based code from the older non-jquery-based code... but if taking a more iterative approach we could replace the addOnloadHook function. --michael Daniel Friesen wrote: I got another, not from the thread of course. I'd like addOnloadHook to be replaced by jQuery's ready which does a much better job of handling load events. ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name] Tim Starling wrote: Here's what I'm taking out of this thread: * Platonides mentions the case of power-users with tens of scripts loaded via gadgets or user JS with importScript(). * Tisza asks that core onload hooks and other functions be overridable by user JS. * Trevor and Michael both mention i18n as an important consideration which I have not discussed. * Michael wants certain components in the js2 directory to be usable as standalone client-side libraries, which operate without MediaWiki or any other server-side application. -- Tim Starling ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
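A minimal sketch of the "more iterative approach" mentioned above (not committed code): keep the existing global name, but back it with jQuery's ready queue so current addOnloadHook( fn ) callers get the better load-event handling without changes.

// Sketch only: delegate the legacy hook to jQuery's ready handling.
window.addOnloadHook = function ( fn ) {
	$j( document ).ready( fn );   // $j = jQuery.noConflict()
};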
Re: [Wikitech-l] JS2 design (was Re: Working towards branching MediaWiki 1.16)
thanks for the constructive response :) ... comments inline Tim Starling wrote: I agree we should move things into a global object ie: $j and all our components / features should extend that object (like jquery plugins). That is the direction we are already going. I think it would be better if jQuery was called window.jQuery and MediaWiki was called window.mw. Then we could share the jQuery instance with JS code that's not aware of MediaWiki, and we wouldn't need to worry about namespace conflicts between third-party jQuery plugins and MediaWiki. Right, but there are benefits to connecting into the jQuery plugin system that would not be as clean to wrap into our window.mw object. For example $('#textbox').wikiEditor() is using jQuery selectors for the target, and maybe other jQuery plugin conventions like the jquery class alias inside (function($){ ... })(jQuery); Although if not designing your tool as a jQuery plugin then yeah ;) ... but I think most of the tools should be designed as jQuery plug-ins. Dependency loading is not really beyond the scope... we are already supporting that. If you check out the mv_jqueryBindings function in mv_embed.js ... here we have loader calls integrated into the jquery binding. This integrates loading the high-level application interfaces into their interface call. Your so-called dependency functions (e.g. doLoadDepMode) just seemed to be a batch load feature, there was no actual dependency handling. Every caller was required to list the dependencies for the classes it was loading. I was referring to defining the dependencies in the module call ... ie $j('target').addMediaWiz( config ) and having the addMediaWiz module map out the dependencies in the javascript. doLoadDepMode just lets you get around an IE bug: when inserting scripts via the dom you have no guarantee one script will execute in the order inserted. If you're concatenating your scripts doLoadDepMode would not be needed, as order will be preserved in the concatenated file. I like mapping out the dependencies in javascript at that module level since it makes it easier to do custom things, like read the passed-in configuration and decide which dependencies you need to fulfill. If not, you have to define many dependency sets in php or have a much more detailed model of your javascript inside php. But I do understand that it will eventually result in lots of extra javascript module definitions that the given installation may not want. So perhaps we generate that module definition via php configuration ... or we define the set of javascript files to include that define the various module loaders we want with a given configuration. This is sort of the approach taken with the wikiEditor, which has a few thin javascript files that make calls to add modules (like add-sidebar) to a core component (wikiEditor). That way the feature set can be controlled by the php configuration while retaining runtime flexibility for dependency mapping. The idea is to move more and more of the structure of the application into that system. So right now mwLoad is a global function, but it should be re-factored into the jquery space and be called via $j.load(); That would work well until jQuery introduced its own script-loader plugin with the same name and some extension needed to use it. That is part of the idea of centrally hosting reusable client-side components so we control the jquery version and plugin set. So a new version won't come along until it's been tested and integrated.
If the function does mediawiki-specific scriptloader load stuff then yeah, it should be called mwLoad or whatnot. If some other plugin or native jquery piece comes along we can just have our plugin override it and/or store the native as a parent (if it's of use) ... if that ever happens... We could add that convention directly into the script-loader function if desired, so that on a per-class level we include dependencies. Like mwLoad('ui.dialog') would know to load ui.core etc. Yes, that is what real dependency handling would do. Thinking about this more ... I think it's a bad idea to exclusively put the dependency mapping in php. It will be difficult to avoid re-including the same things in client-side loading chains. Say you have your suggest search system: once the user starts typing we load jquery.suggest; it knows that it needs jquery ui via dependency mapping stored in php. It sends both ui and suggest to the client. Now the user in the same page instance decides instead to edit a section. The editTool script-loader gets called; its dependencies also include jquery.ui. How will the dependency-loader script-server know that the client already has the jquery.ui dependency from the suggest tool? In the end you need these dependencies mapped out in the JS so that the client can intelligibly request the script set it needs. In that same example if the
Re: [Wikitech-l] JS2 design (was Re: Working towards branching MediaWiki 1.16)
~some comments inline~ Tim Starling wrote: [snip] I started off working on fixing the coding style and the most glaring errors from the JS2 branch, but I soon decided that I shouldn't be putting so much effort into that when a lot of the code would have to be deleted or rewritten from scratch. I agree there are some core components that should be separated out and re-factored. And some core pieces that you're probably focused on do need to be removed or rewritten, as they have aged quite a bit. (parts of mv_embed.js were created in SOC 06) ... I did not focus on the ~best~ core loader that could have been created; I have just built on what I already had available, which has worked reasonably well for the application set that I was targeting. It's been an iterative process which I feel is moving in the right direction, as I will outline below. Obviously more input is helpful, and I am open to implementing most of the changes you describe as they make sense. But exclusion and dismissal may be less helpful... unless that is your targeted end, in which case just say so ;) It's normal for a 3rd party observer to say the whole system should be scrapped and rewritten. Of course when starting from scratch it is much easier to design an ideal system and what it should/could be. I did a survey of script loaders in other applications, to get an idea of what features would be desirable. My observations came down to the following: * The namespacing in Google's jsapi is very nice, with everything being a member of a global google object. We would do well to emulate it, but migrating all JS to such a scheme is beyond the scope of the current project. You somewhat contradict this approach by recommending against class abstraction below... ie how will you cleanly load components and dependencies if not by a given name? I agree we should move things into a global object ie: $j and all our components / features should extend that object (like jquery plugins). That is the direction we are already going. Dependency loading is not really beyond the scope... we are already supporting that. If you check out the mv_jqueryBindings function in mv_embed.js ... here we have loader calls integrated into the jquery binding. This integrates loading the high-level application interfaces into their interface call. The idea is to move more and more of the structure of the application into that system. So right now mwLoad is a global function, but it should be re-factored into the jquery space and be called via $j.load(); * You need to deal with CSS as well as JS. All the script loaders I looked at did that, except ours. We have a lot of CSS objects that need concatenation, and possibly minification. Brion did not set that as high priority when I inquired about it, but of course we should add in style grouping as well. It's not like I said we should exclude that from our script-loader; it's just a matter of setting priority, which I agree is high. * JS loading can be deferred until near the </body> or until the DOMContentLoaded event. This means that empty-cache requests will render faster. Wordpress places emphasis on this. True. I agree that we should put the script includes at the bottom. Also, all non-core js2 scripts are already loaded via the DOMContentLoaded ready event. Ideally we should only provide loaders and maybe some small bit of configuration for the client-side applications they provide. As briefly described here: http://www.mediawiki.org/wiki/JS2_Overview#How_to_structure_your_JavaScript_application * Dependency tracking is useful.
The idea is to request a given module, and all dependencies of that module, such as other scripts, will automatically be loaded first. As mentioned above we do some dependency tracking via binding jquery helpers that do that setup internally on a per application interface level. We could add that convention directly into the script-loader function if desired so that on a per class level we include dependencies. Like mwLoad('ui.dialog') would know to load ui.core etc. I then looked more closely at the current state of script loading in MediaWiki. I made the following observations: * Most linked objects (styles and scripts) on a typical page view come from the Skin. If the goal is performance enhancement, then working on the skins and OutputPage has to be a priority. agreed. The script-loading was more urgent for my application task set. But for the common case of per page view performance css grouping has bigger wins. * The class abstraction as implemented in JS2 has very little value to PHP callers. It's just as easy to use filenames. The idea with class abstraction is that you don't know what script set you have available at any given time. Maybe one script included ui.resizable and ui.move and now your script depends on ui.resizable and ui.move and ui.drag... your loader call will only include ui.drag (since the
Re: [Wikitech-l] Working towards branching MediaWiki 1.16
I would add that I am of course open to reorganization and would happily discuss why any given decision was made ... be it trade offs with other ways of doing things or lack of time to do it differently / better. I also add that not all the legacy support and metavid based code has been factored out. (for example for a while I supported the form based upload but now that the upload api is in place I should remove that old code) Other things like timed text support are barely supported because of lack of time. But I would want to keep the skeleton of timed text in there so once we do get around to adding timed text for video we have a basis to move forward from. I suggest for a timely release that you strip the js2 folder and make a note that the configuration variable can not be turned on in this release. And help me identify any issues that need to be addressed for inclusion the next release? And finally, the basic direction and feature set was proposed on this list quite some time ago and ~some~ feedback was given at the time. I would also would echo Trevor's call for more discussion with affected parties if your proposing significant changes. peace, --michael Trevor Parscal wrote: On 9/22/09 6:19 PM, Tim Starling wrote: Siebrand Mazeland wrote: Hola, I just created https://bugzilla.wikimedia.org/show_bug.cgi?id=20768 (Branch 1.16) and Brion was quick to respond that some issues with js2 and the new-upload stuff need to be ironed out; valid concerns, of course. I proposed to make bug 20768 a tracking bug, so that it can be made visible what issues are to/could be considered blocking something we can make a 1.16 out of. Let the dependency tagging begin. Users of MediaWiki trunk are encouraged to report each and every issue, so that what is known can also be resolved (eventually). I'm calling on all volunteer coders to keep an eye on this issue and to help out fixing issues that are mentioned here. I've been working on a rewrite of the script loader and a reorganisation of the JS2 stuff. I'd like to delay 1.16 until that's in and tested. Brion has said that he doesn't want Michael Dale's branch merge reverted, so as far as I can see, a schedule delay is the only other way to maintain an appropriate quality. -- Tim Starling ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l If you are really doing a JS2 rewrite/reorganization, would it be possible for some of us (especially those of us who deal almost exclusively with JavaScript these days) to get a chance to ask questions/give feedback/help in general? While I think a rewrite/reorganization could be really awesome if done right, and also that getting it right will be easier if we can get some interested parties informed/consulted. I know that Michael Dale's work was more or less done outside of the general MediaWiki branch for the majority of it's development, and afaik it has been a work in progress for some time, so I feel that such a golden opportunity has never really come up before. Aside from my own desire to be involved at some level, it seems fitting to have some sort of discussion at times like these so we can make sure we are making the best decisions about software before it's deployed - as making changes to deployed software is seems to often be much more difficult. Perhaps there's a MediaWiki page, or a time on IRC, or even just continuing on this list...? My first question is: What are you changing and how, and what are you moving and where? 
- Trevor ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Video transcoding settings Was: [54611] trunk/extensions/WikiAtHome/WikiAtHome.php
yea was using the wrong version of ffmpeg2theora locally ;)... Thanks for the reminder, updated our ffmpeg2theora encode command in r55042 ... an update to firefogg should support the --buf-delay argument shortly as well. --michael Gregory Maxwell wrote: On Fri, Aug 7, 2009 at 5:29 PM, d...@svn.wikimedia.org wrote: http://www.mediawiki.org/wiki/Special:Code/MediaWiki/54611 Revision: 54611 Author: dale Date: 2009-08-07 21:29:26 + (Fri, 07 Aug 2009) Log Message: --- added a explicit keyframeInterval per gmaxwell's mention on wikitech-l. (I get ffmpeg2theora: unrecognized option `--buf-delay for adding in buf-delay) I thought firefogg was tracking j^'s nightly? If the encoder has two-pass it has --buf-delay. Does firefog perhaps need to be changed to expose it? ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Video Quality for Derivatives (was Re:w...@home Extension)
So I committed ~basic~ derivative code support for oggHandler in r54550 (more solid support on the way). Based on input from the w...@home thread, here are updated target qualities expressed via the firefogg api to ffmpeg2theora. Also j^ was kind enough to run these settings on some sample input files: http://firefogg.org/j/encoding_samples/ so you can check them out there. We want to target 400 wide for the web stream to be consistent with archive.org, which encodes mostly to 400x300 (although their 16:9 stuff can be up to 530 wide) ... Updated the mediaWiki firefogg integration and the stand-alone encoder app with these default transcode settings in r54552 / r54554 (should be pushed out to http://firefogg.org/make shortly ... or can be run @home with a trunk check out at: /js2/mwEmbed/example_usage/Firefogg_Make_Advanced.html). Anyway, on to the settings:
$wgDerivativeSettings[ WikiAtHome::ENC_SAVE_BANDWITH ] = array(
	'maxSize'      => '200',
	'videoBitrate' => '164',
	'audioBitrate' => '32',
	'samplerate'   => '22050',
	'framerate'    => '15',
	'channels'     => '1',
	'noUpscaling'  => 'true'
);
$wgDerivativeSettings[ WikiAtHome::ENC_WEB_STREAM ] = array(
	'maxSize'      => '400',
	'videoBitrate' => '544',
	'audioBitrate' => '96',
	'noUpscaling'  => 'true'
);
$wgDerivativeSettings[ WikiAtHome::ENC_HQ_STREAM ] = array(
	'maxSize'      => '1080',
	'videoQuality' => 6,
	'audioQuality' => 3,
	'noUpscaling'  => 'true'
);
--michael Brion Vibber wrote: On 8/3/09 9:56 PM, Gregory Maxwell wrote: [snip] Based on 'what other people do' I'd say the low should be in the 200kbit-300kbit/sec range. Perhaps taking the high up to a megabit? There are also a lot of very short videos on Wikipedia where the whole thing could reasonably be buffered prior to playback. Something I don't have an answer for is what resolutions to use. The low should fit on mobile device screens. At the moment the defaults we're using for Firefogg uploads are 400px width (eg, 400x300 or 400x225 for the most common aspect ratios) targeting a 400kbps bitrate. IMO at 400kbps at this size things don't look particularly good; I'd prefer a smaller size/bitrate for 'low' and higher size/bitrate for medium qual. From sources I'm googling up, looks like YouTube is using 320x240 for low-res, 480x360 h.264 @ 512kbps+128kbps audio for higher-qual, with 720p h.264 @ 1024Kbps+232kbps audio available for some HD videos. http://www.squidoo.com/youtubehd These seem like pretty reasonable numbers to target; offhand I'm not sure the bitrates used for the low-res version but I think that's with older Flash codecs anyway so not as directly comparable. Also, might we want different standard sizes for 4:3 vs 16:9 material? Perhaps we should wrangle up some source material and run some test compressions to get a better idea what this'll look like in practice... Normally I'd suggest setting the size based on the content: low-motion detail-oriented video should get higher resolutions than high-motion scenes without important details. Doubling the number of derivatives in order to have a large and small setting on a per-article basis is probably not acceptable. :( Yeah, that's way tougher to deal with... Potentially we could allow some per-file tweaks of bitrates or something, but that might be a world of pain. :) As an aside -- downsampled video needs some makeup sharpening like downsampled stills will. I'll work on getting something in ffmpeg2theora to do this. Woohoo! There is also the option of decimating the frame-rate.
Going from 30fps to 15fps can make a decent improvement for bitrate vs visual quality but it can make some kinds of video look jerky. (Dropping the frame rate would also be helpful for any CPU starved devices) 15fps looks like crap IMO, but yeah for low-bitrate it can help a lot. We may wish to consider that source material may have varying frame rates, most likely to be: 15fps - crappy low-res stuff found on internet :) 24fps / 23.98 fps - film-sourced 25fps - PAL non-interlaced 30fps / 29.97 fps - NTSC non-interlaced or many computer-generated vids 50fps - PAL interlaced or PAL-compat HD native 60fps / 59.93fps - NTSC interlaced or HD native And of course those 50 and 60fps items might be encoded with or without interlacing. :) Do we want to normalize everything to a standard rate, or maybe just cut 50/60 to 25/30? (This also loses motion data, but not as badly as decimation to 15fps!) This brings me to an interesting point about instant gratification: Ogg was intended from day one to be a streaming format. This has pluses and minuses, but one thing we should take
Re: [Wikitech-l] Wiki at Home Extension
Google's cost is probably more on the distribution side of things ... but I only found a top-level number, not a breakdown of component costs. At any rate the point is to start exploring distributing the costs associated with large-scale video collaboration. In that way I target developing a framework where individual pieces can be done on the server or on clients, depending on what is optimal. It's not that much extra effort to design things this way. Look back 2 years and you can see the xiph community's blog posts and conversations with Mozilla. It was not a given that Firefox would ship with ogg theora baseline video support (they took some convincing and had to do some thinking about it; a big site like wikipedia exclusively using the free-formats technology probably helped their decision). Originally the xiph/annodex community built the liboggplay library as an extension. This later became the basis for the library that powers firefox ogg theora video today. Likewise we are putting features into firefogg that we eventually hope will be supported by browsers natively. Also in theory we could put a thin bittorrent client into the java Cortado player to support IE users as well. peace, --michael Tisza Gergő wrote: Michael Dale mdale at wikimedia.org writes: * We are not Google. Google lost what like ~470 million~ last year on youtube ...(and that's with $240 million in advertising) so total cost of $711 million [1] How much of that is related to transcoding, and how much to delivery? You seem to be conflating the two issues. We cannot do much to cut delivery costs, save for serving fewer movies to readers - distributed transcoding would actually raise them. (Peer-to-peer video distribution sounds like a cool feature, but it needs to be implemented by browser vendors, not Wikimedia.) ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Wiki at Home Extension
yea would have to be opt in. Would have to have controls over how-much bandwidth sent out... We could encourage people to enable it by sending out a the higher bit-rate / quality version ~by default~ for those that opt-in. --michael Ryan Lane wrote: On Mon, Aug 3, 2009 at 1:57 PM, Michael Dalemd...@wikimedia.org wrote: Look back 2 years and you can see the xiph communities blog posts and conversations with Mozilla. It was not a given that Firefox would ship with ogg theora baseline video support (they took some convening and had to do some thinking about it, a big site like wikipedia exclusively using the free formats technology probably helped their decision). Originally the xiph/annodex community built the liboggplay library as an extension. This later became the basis for the library that powers firefox ogg theora video today. Likewise we are putting features into firefogg that we eventually hope will be supported by browsers natively. Also in theory we could put a thin bittorrent client into java Cortado to support IE users as well. If watching video on Wikipedia requires bittorrent, most corporate environments are going to be locked out. If a bittorrent client is loaded by default for the videos, most corporate environments are going to blacklist wikipedia's java apps. I'm not saying p2p distributed video is a bad idea, and the Wikimedia foundation may not care about how corporate environments react; however, I think it is a bad idea to either force users to use a p2p client, or make them opt-out. Ignoring corporate clients... firing up a p2p client on end-user's systems could cause serious issues for some. What if I'm browsing on a 3g network, or a satellite connection where my bandwidth is metered? Maybe this is something that could be delivered via a gadget and enabled in user preferences? V/r, Ryan Lane ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] w...@home Extension
perhaps if people create a lot of voice-overs and ~Ken Burns~ effects on commons images, with the occasional inter-spliced video clip and lots of back-and-forth editing... and we are constantly creating timely derivatives of these flattened sequences, that ~may~ necessitate such a system.. because things will be updating all the time ... ... but anyway... yeah, for now I will focus on flattening sequences... Did a basic internal encoder, committed in r54340... Could add some enhancements, but let's spec out what we want ;) Still need to clean up the File:myFile.mp4 situation. Probably store in a temp location, write out a File:myFile.ogg placeholder, then once transcoded swap it in? Also will hack in adding derivatives to the job queue where oggHandler is embedded in a wiki article at a substantially lower resolution than the source version. Will have it send the high-res version until the derivative is created, then purge the pages to point to the new location. Will try and have the download link still point to the high-res version. (We will only create one or two derivatives... also we should decide if we want an ultra-low-bitrate (200kbps or so) version for people accessing Wikimedia on slow / developing-country connections.) peace, michael

Brion Vibber wrote: On 7/31/09 6:51 PM, Michael Dale wrote: Want to point out the working prototype of the w...@home extension. Presently it focuses on a system for transcoding uploaded media to free formats, but will also be used for flattening sequences and maybe other things in the future ;) Client-side rendering does make sense to me when integrated into the upload and sequencer processes; you've got all the source data you need and local CPU time to kill while you're shuffling the bits around on the wire. But I haven't yet seen any evidence that a distributed rendering network will ever be required for us, or that it would be worth the hassle of developing and maintaining it. We're not YouTube, and don't intend to be; we don't accept everybody's random vacation videos, funny cat tricks, or rips from Cartoon Network... Between our licensing requirements and our limited scope -- educational and reference materials -- I think we can reasonably expect that our volume of video will always be *extremely* small compared to general video-sharing sites. We don't actually *want* everyone's blurry cell-phone vacation videos of famous buildings (though we might want blurry cell-phone videos of *historical events*, as with the occasional bit of interesting news footage). Shooting professional-quality video suitable for Wikimedia use is probably two orders of magnitude harder than shooting attractive, useful still photos. Even if we make major pushes on the video front, I don't think we'll ever have the kind of mass volume that would require a distributed encoding network. -- brion ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Wiki at Home Extension
two quick points. 1) you don't have to re-upload the whole video, just the sha1 or some sort of hash of the assigned chunk. 2) it should be relatively straightforward to catch abuse via the user IDs assigned to each chunk uploaded. And checking the sha1 a few times from other random clients that are encoding other pieces would make abuse very difficult... at the cost of a few small http requests after the encode is done, and at the cost of slightly more CPU cycles from the computing pool. But as this thread has pointed out, CPU cycles are much cheaper than bandwidth bits or human time patrolling derivatives. We have the advantage with a system like Firefogg that we control the version of the encoder pushed out to clients via auto-update, and can check that before accepting their participation (so sha1s should match if the client is not doing anything fishy). But these are version 2 type features, conditioned on 1) bandwidth being cheap and internal computer system maintenance and acquisition being slightly more costly, and/or 2) us integrating a thin bittorrent client into firefogg so we hit the "send out the source footage only once" upstream cost ratio. We need to start exploring the bittorrent integration anyway to distribute the bandwidth cost on the distribution side. So this work would lead us in a good direction as well. peace, --michael

Tisza Gergő wrote: Steve Bennett stevagewp at gmail.com writes: Why are we suddenly concerned about someone sneaking obscenity onto a wiki? As if no one has ever snuck a rude picture onto a main page... There is a slight difference between vandalism that shows up in recent changes and one that leaves no trail at all except maybe in log files only accessible to sysadmins. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
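A minimal sketch of that sha1 spot-check idea (function and parameter names are hypothetical, not from the extension): re-assign an already-encoded chunk to a couple of other random clients and only accept it once enough independent hash reports agree.

// Sketch of verifying a transcoded chunk by comparing sha1 reports from
// independent clients. $reports maps user ID => sha1 reported for the chunk.
// Names here are hypothetical, not from the WikiAtHome extension.
function wahChunkVerified( array $reports, $minAgreement = 2 ) {
	$counts = array();
	foreach ( $reports as $userId => $sha1 ) {
		$counts[$sha1] = isset( $counts[$sha1] ) ? $counts[$sha1] + 1 : 1;
	}
	arsort( $counts );
	$topSha1 = key( $counts );
	// Accept the chunk only once $minAgreement independent clients agree;
	// clients whose hash disagrees with the winner can be flagged for review.
	return ( $counts[$topSha1] >= $minAgreement ) ? $topSha1 : false;
}

// Example: two encoders agree, a third (possibly fishy) client disagrees.
$winner = wahChunkVerified( array( 101 => 'aaa111', 202 => 'aaa111', 303 => 'bbb222' ) );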
Re: [Wikitech-l] Wiki at Home Extension
Let's see...
* All these tools will be needed for flattening sequences anyway. In that case CPU costs are really, really high (like 1/5 of real-time or lower) and the number of computations needed explodes much faster, since every stable edit necessitates a new flattening of some portion of the sequence.
* I don't think it's possible to scale the foundation's current donation model to traditional free net video distribution.
* We are not Google. Google lost what, like ~470 million~ last year on youtube... (and that's with $240 million in advertising) so a total cost of $711 million [1]. Say we manage to do 1/100th of youtube (not unreasonable considering we are a top 4 site. Just imagine a world where you watch one wikipedia video for every 100 you watch on youtube)... then we would be at something like 7x the total budget? (and they are not supporting video editing with flattening of sequences) ... The pirate bay, on the other hand, operates at a technology cost comparable to wikimedia (~$3K~ a month in bandwidth) and is distributing like 1/2 of the net's torrents? [2]. (Obviously these numbers are a bit of tea-leaf reading, but give or take an order of magnitude it should still be clear which model we should be moving towards.) ... I think it's good to start thinking about p2p distribution and computation ... even if we are not using it today ...
* I must say I don't quite agree with your proposed tactic of retaining neutral networks by avoiding bandwidth distribution via peer 2 peer technology. I am aware the net is not built for p2p, nor is it very efficient vs CDNs ... but the whole micropayment system never panned out ... Perhaps you're right that p2p will just give companies an excuse to restructure the net in a non-network-neutral way... but I think they already have plenty of excuse with the existing popular bittorrent systems, and I don't see any other way for not-for-profit net communities to distribute massive amounts of video to each other.
* I think you may be blowing this ~a bit~ out of proportion, calling foundation priorities into question in the context of this hack. If this were a big initiative over the course of a year, or an initiative taking more than part-time effort over a week, then it would make more sense to worry about this. But in its present state it's just a quick hack and the starting point of a conversation, not foundation policy or initiative.

peace, michael
[1] http://www.ibtimes.com/articles/20090413/alleged-470-million-youtube-loss-will-be-cleared-week.htm
[2] http://newteevee.com/2009/07/19/the-pirate-bay-distributing-the-worlds-entertainment-for-3000-a-month/

Gregory Maxwell wrote: On Sun, Aug 2, 2009 at 6:29 PM, Michael Dalemd...@wikimedia.org wrote: [snip] two quick points. 1) you don't have to re-upload the whole video just the sha1 or some sort of hash of the assigned chunk. But each re-encoder must download the source material. I agree that uploads aren't much of an issue. [snip] other random clients that are encoding other pieces would make abuse very difficult... at the cost of a few small http requests after the encode is done, and at a cost of slightly more CPU cycles of the computing pool. Is 2x slightly? (Greater because some clients will abort/fail.) Even that leaves open the risk that a single troublemaker will register a few accounts and confirm their own blocks. You can fight that too— but it's an arms race with no end. I have no doubt that the problem can be made tolerably rare— but at what cost?
I don't think it's all that acceptable to significantly increase the resources used for the operation of the site just for the sake of pushing the capital and energy costs onto third parties, especially when it appears that the cost to Wikimedia will not decrease (but instead be shifted from equipment cost to bandwidth and developer time). [snip] We need to start exploring the bittorrent integration anyway to distribute the bandwidth cost on the distribution side. So this work would lead us in a good direction as well. http://lists.wikimedia.org/pipermail/wikitech-l/2009-April/042656.html I'm troubled that Wikimedia is suddenly so interested in all these cost externalizations which will dramatically increase the total cost but push those costs off onto (sometimes unwilling) third parties. Tech spending by the Wikimedia Foundation is a fairly small portion of the budget, enough that it has drawn some criticism. Behaving in the most efficient manner is laudable and the WMF has done excellently on this front in the past. Behaving in an inefficient manner in order to externalize costs is, in my view, deplorable and something which should be avoided. Has some organizational problem arisen within Wikimedia which has made it unreasonably difficult to obtain computing resources, but easy to burn bandwidth and development time? I'm struggling to understand
Re: [Wikitech-l] w...@home Extension
Some notes:
* ~It's mostly an API~. We can run it internally if that is more cost efficient. (Will do a command-line client shortly.) (As mentioned earlier, the present code was hacked together quickly; it's just a prototype. I will generalize things to work better as internal jobs, and I think I will not create File:Myvideo.mp4 wiki pages but rather create a placeholder File:Myvideo.ogg page and only store the derivatives outside of the wiki page node system. I also notice some sync issues with oggCat which are under investigation.)
* Clearly CPUs are cheap, as is power for the computers, human resources for system maintenance, rack space and internal network management, and we of course will want to run the numbers on any solution we go with. I think your source bitrate assumption was a little high; I would think more like 1-2Mbs (with cell-phone cameras targeting low bitrates for transport and desktops re-encoding before upload). But I think this whole conversation is missing the larger issue, which is: if it's cost prohibitive to distribute a few copies for transcoding, how are we going to distribute the derivatives thousands of times for viewing? Perhaps future work in this area should focus more on the distribution bandwidth cost issue.
* Furthermore, I think I might have mis-represented w...@home. I should have more clearly focused on the sequence flattening and only mentioned transcoding as an option. With sequence flattening we have a more standard viewing bitrate of source material and the CPU costs for rendering are much higher. At present there is no fast way to overlay html/svg on video with filters and effects that are presently only predictably defined in javascript. For this reason we use the browser to wysiwyg-render out the content. Eventually we may want to write an optimized stand-alone flattener, but for now the w...@home solution is worlds less costly in terms of developer resources, since we can use the editor to output the flat file.
3) And finally yes ... you can already insert a penis into video uploads today. With something like: oggCat | ffmpeg2theora -i someVideo.ogg -s 0 -e 42.2 myOneFramePenis.ogg ffmpeg2theora -i someVideo.ogg -s 42.2 But yeah, it's one more level to worry about, and if it's cheaper to do it internally (the transcodes, not the penis insertion) we should do it internally. :P (I hope others appreciate the multiple levels of humor here.) peace, michael

Gregory Maxwell wrote: On Sat, Aug 1, 2009 at 2:54 AM, Brianbrian.min...@colorado.edu wrote: On Sat, Aug 1, 2009 at 12:47 AM, Gregory Maxwell gmaxw...@gmail.com wrote: On Sat, Aug 1, 2009 at 12:13 AM, Michael Dalemd...@wikimedia.org wrote: Once you factor in the ratio of video to non-video content for the foreseeable future this comes off looking like a time-wasting boondoggle. I think you vastly underestimate the amount of video that will be uploaded. Michael is right in thinking big and thinking distributed. CPU cycles are not *that* cheap. Really rough back-of-the-napkin numbers: My desktop has a X3360 CPU. You can build systems all day using this processor for $600 (I think I spent $500 on it 6 months ago). There are processors with better price/performance available now, but I can benchmark on this. Commons is getting roughly 172076 uploads per month now across all media types. Scans of single pages, photographs copied from flickr, audio pronunciations, videos, etc. If everyone switched to uploading 15 minute long SD videos instead of other things there would be 154,868,400 seconds of video uploaded to commons per month.
Truly a staggering amount. Assuming a 40-hour work week it would take over 250 people working full time just to *view* all of it. That number is an average rate of 58.9 seconds of video uploaded per second, every second of the month. Using all four cores my desktop video encodes at 16x real-time (for moderate-motion standard-def input using the latest theora 1.1 svn). So you'd need fewer than four of those systems to keep up with the entire commons upload rate switched to 15-minute videos. Okay, it would be slow at peak hours and you might wish to produce a couple of versions at different resolutions, so multiply that by a couple. This is what I meant by processing being cheap. If the uploads were all compressed at a bitrate of 4mbit/sec, and users were kind enough to spread their uploads out through the day, and the distributed system were perfectly efficient (only needing to send one copy of the upload out), and if Wikimedia were only paying $10/mbit/sec/month for transit out of their primary datacenter... we'd find that the bandwidth costs of sending that source material out again would be $2356/month. (58.9 seconds per second * 4mbit/sec * $10/mbit/sec/month) (Since transit billing is on the 95th percentile 5 minute average of the greater of inbound or outbound, uploads are basically free, but
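For anyone who wants to poke at the napkin math, the figures above reduce to a few lines of arithmetic (assuming an average month of 365.25/12 days, which is what makes the rate come out near 58.9 seconds of video per second):

// Back-of-the-napkin numbers from the thread, spelled out.
$uploadsPerMonth  = 172076;            // current Commons uploads/month, all media
$secondsPerUpload = 15 * 60;           // pretend every upload were a 15 min SD video
$secondsUploaded  = $uploadsPerMonth * $secondsPerUpload;   // 154,868,400 s/month
$secondsPerMonth  = 365.25 * 86400 / 12;                    // ~2,629,800 s
$rate             = $secondsUploaded / $secondsPerMonth;    // ~58.9 s of video per second

$sourceBitrate    = 4;                 // Mbit/s assumed for the source material
$transitPrice     = 10;                // $ per Mbit/s per month
$monthlyCost      = $rate * $sourceBitrate * $transitPrice; // ~$2356 / month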
Re: [Wikitech-l] w...@home Extension
I had to program it anyway to support the distributing of the flattening of sequences. Which has been the planed approach for quite some time. I thought of the name and adding one-off support for transocoding recently, and hacked it up over the past few days. This code will eventually support flattening of sequences. But adding code to do transcoding was a low hanging fruit feature and easy first step. We can now consider if its efficient to use the transcoding feature in wikimedia setup or not but I will use the code either way to support sequence flattening (which has to take place in the browser since there is no other easy way to guarantee wysiwyg flat representation of browser edited sequences ) peace, --michael Mike.lifeguard wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 BTW, Who's idea was this extension? I know Michael Dale is writing it, but was this something assigned to him by someone else? Was it discussed beforehand? Or is this just Michael's project through and through? Thanks, - -Mike -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkp0zv4ACgkQst0AR/DaKHtFVACgyH8J835v8xDGMHL78D+pYrB7 NB8AoMZVwO7gzg9+IYIlZh2Zb3zGG07q =tpEc -END PGP SIGNATURE- ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] w...@home Extension
Want to point out the working prototype of the w...@home extension. Presently it focuses on a system for transcoding uploaded media to free formats, but it will also be used for flattening sequences and maybe other things in the future ;) It's still rough around the edges ... it presently features:

* Support for uploading non-free media assets,
* putting those non-free media assets into a jobs table and distributing the transcode job into $wgChunkDuration-length encoding jobs. (Each piece is uploaded then reassembled on the server; that way big transcoding jobs can be distributed to as many clients as are participating. A rough sketch of this chunking follows below.)
* It supports multiple derivatives for different resolutions based on the requested size.
** In the future I will add a hook for oggHandler to use that as well ... since a big usability issue right now is users embedding HD or high-res ogg videos into a small video space in an article ... and naturally it performs slowly.
* It also features a JavaScript interface for clients to query for new jobs, get the job, download the asset, do the transcode, and upload it (all through an api module, so people could build a client as a shell script if they wanted).
** In the future the interface will support preferences, basic statistics and more options, like "turn on w...@home every time I visit wikipedia" or "only get jobs while I am away from my computer".
* I try to handle derivatives consistently with the file/media handling system. So right now your uploaded non-free-format file will be linked to on the file detail page and via the api calls. We should probably limit client exposure to non-free formats. Obviously the files have to be on a public url to be transcoded, but the interfaces for embedding and the stream detail page should link to the free-format version at all times.
* I tie transcoded chunks to user ids; this makes it easier to disable bad participants.
** I need to add an interface to delete derivatives if someone flags them as bad.
* It supports $wgJobTimeOut for re-assigning jobs that don't get done in $wgJobTimeOut time.

This was hacked together over the past few days so it's by no means production ready ... but should get there soon ;) Feedback is welcome. It's in the svn at: /trunk/extensions/WikiAtHome/ peace, michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
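To make the $wgChunkDuration bullet above a bit more concrete, here is a sketch of how an upload could be split into encode jobs; the table and field names are hypothetical, not the actual WikiAtHome schema.

// Sketch: split a source file into $wgChunkDuration-length encode jobs.
// Table and field names are hypothetical; the real extension may differ.
function wahAddTranscodeJobs( $fileKey, $sourceDuration, $derivativeKey ) {
	global $wgChunkDuration;
	$dbw = wfGetDB( DB_MASTER );
	for ( $start = 0; $start < $sourceDuration; $start += $wgChunkDuration ) {
		$dbw->insert( 'wah_jobs', array(
			'job_file_key'   => $fileKey,
			'job_derivative' => $derivativeKey,
			'job_start'      => $start,
			'job_end'        => min( $start + $wgChunkDuration, $sourceDuration ),
			'job_client'     => null,   // assigned when a client asks for work
			'job_done'       => 0,
		), __METHOD__ );
	}
}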
Re: [Wikitech-l] w...@home Extension
Gregory Maxwell wrote: On Fri, Jul 31, 2009 at 9:51 PM, Michael Dalemd...@wikimedia.org wrote: the transcode job into $wgChunkDuration length encoding jobs. ( each pieces is uploaded then reassembled on the server. that way big transcoding jobs can be distributed to as many clients that are participating ) This pretty much breaks the 'instant' gratification you currently get on upload. true... people will never upload to site without instant gratification ( cough youtube cough ) ... At any rate its not replacing the firefogg that has instant gratification at point of upload its ~just another option~... Also I should add that this w...@home system just gives us distributed transcoding as a bonus side effect ... its real purpose will be to distribute the flattening of edited sequences. So that 1) IE users can view them 2) We can use effects that for the time being are too computationally expensive to render out in real-time in javascript 3) you can download and play the sequences with normal video players and 4) we can transclude sequences and use templates with changes propagating to flattened versions rendered on the w...@home distributed computer While presently many machines in the wikimedia internal server cluster grind away at parsing and rendering html from wiki-text the situation is many orders of magnitude more costly with using transclution and temples with video ... so its good to get this type of extension out in the wild and warmed up for the near future ;) The segmenting is going to significant harm compression efficiency for any inter-frame coded output format unless you perform a two pass encode with the first past on the server to do keyframe location detection. Because the stream will restart at cut points. also true. Good thing theora-svn now supports two pass encoding :) ... but an extra key frame every 30 seconds properly wont hurt your compression efficiency too much.. vs the gain of having your hour long interview trans-code a hundred times faster than non-distributed conversion. (almost instant gratification) Once the cost of generating a derivative is on par with the cost of sending out the clip a few times for viewing lots of things become possible. * I tie transcoded chunks to user ids this makes it easier to disable bad participants. Tyler Durden will be sad. But this means that only logged in users will participate, no? true... You also have to log in to upload to commons It will make life easier and make abuse of the system more difficult.. plus it can act as a motivation factor with distribu...@home teams, personal stats and all that jazz. Just as people like to have their name show up on the donate wall when making small financial contributions. peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
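To put a rough number on the extra-keyframe worry: assuming (my assumption, not a measurement) that a forced Theora keyframe costs on the order of 25 KB more than the delta frame it replaces, chunking an hour-long 544 kbit/s web stream into 30-second pieces adds only around a percent of overhead.

// Rough overhead estimate for restarting the stream at chunk boundaries.
// The 25 KB per forced keyframe figure is an illustrative assumption only.
$chunkLen       = 30;                   // seconds per w@home chunk
$clipLen        = 3600;                 // an hour-long interview
$videoBitrate   = 544;                  // kbit/s, the ENC_WEB_STREAM target
$extraKeyframes = $clipLen / $chunkLen;                 // ~120
$extraBytes     = $extraKeyframes * 25 * 1024;          // ~3.1 MB
$totalBytes     = $videoBitrate * 1000 / 8 * $clipLen;  // ~245 MB
$overhead       = $extraBytes / $totalBytes;            // ~1.3%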
[Wikitech-l] Recommending a browser for video (was: Proposal: switch to HTML 5)
This is really a foundation / wikimedia community question. ... I will do a short email to foundation-l summarizing the technical discussion. Not that foundation-l has historically been the best way to build consensus but maybe someone else can summarize that discussion and give us a ball-park of the community voice on the matter allowing the foundation to move forward with something. Meanwhile I will try and make sure the new player is good and ready to be integrated ;) --michael Aryeh Gregor wrote: On Wed, Jul 8, 2009 at 6:12 PM, David Gerarddger...@gmail.com wrote: They are happy to foul up the entire standard. I feel there is little to no benefit to us in trying to imply that the situation is otherwise. First of all, Apple is not fouling up the entire standard. They employ one of its two co-editors, their developers contribute to it very actively, and they ship an implementation that's as advanced as anybody's. This is *one* specific feature that they've said they won't implement at the present time (but they may reconsider at any time). Mozilla has vetoed features as well, as Ian Hickson has pointed out. Mozilla refused to implement SQL, so that was removed from the standard, just as mention of Theora was. Second of all, I don't have a serious problem with Wikimedia only advocating the use of open-source software, say. But if it does, it *must* be phrased in a way that makes it clear that it's an advertisement of a product we want the user to use, not a neutral assessment of what the best technology is for viewing the page. Anything else is deliberately misleading, and that's unacceptable. On Wed, Jul 8, 2009 at 6:21 PM, Gregory Maxwellgmaxw...@gmail.com wrote: Regardless, I think we've finished the technical part of this decision— the details are a matter of organization concern now, not technology. Yep, definitely. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: [whatwg] Serving up Theora video in the real world
Tell the users to complain to Apple? .. Bring up anti-competitive lawsuits against apple? Buy a Mobil device that is less locked down? There is no easy solution when the platform is a walled garden. There are two paths towards supporting html5 video in mobile platforms. 1) getting things working within the provided web browser platform or 2) running your own browser software as an application (we only should consider a normal phone obviously on a jail-broken device you can do lots of things... but that greatly reduces the possibility of wide deployment) I was looking at this situation for the iPhone and Android based phones. I think android based phones have a better shot at supporting ogg theora html5 video in the near term. In the long term the market will drive the devices to support ogg or not. iPhone 1) The internals of the quicktime/media system for the iPhone are not very exposed nor do they appear to be very extendable. 2) The Apple SDK agreement forbids virtual machines of any kind. This effectively makes competing web browsers illegal. Android / HTC phones: 1) I would hope google/android would ship theora/html5 support since theora will be supported in their desktop webkit based chrome browser. I think it would be relatively easy for a given android based phone distributer to support ogg once webkit on android supports html5 video. 2) Android recently added native code exposure: http://android-developers.blogspot.com/2009/06/introducing-android-15-ndk-release-1.html I wonder if this could be a path for a port of Firefox or a custom version of the open source webkit browser on android? --michael David Gerard wrote: Another answer - it'd be custom app time. So the question is: what do we tell iPhone users? - d. -- Forwarded message -- From: Maciej Stachowiak m...@apple.com Date: 2009/7/10 Subject: Re: [whatwg] Serving up Theora video in the real world To: David Gerard dger...@gmail.com Cc: WHATWG Proposals wha...@lists.whatwg.org On Jul 9, 2009, at 2:59 PM, David Gerard wrote: The question is what to do for platforms such as the iPhone, which doesn't even run Java. Is there any way to install an additional codec in the iPhone browser? Is it (even theoretically) possible to put a free app on the AppStore just to play Ogg Theora video for our users? (There are many AppStore apps that support Ogg Vorbis, don't know if any support Theora - so presumably AppStore stuff doesn't give Apple the feared submarine patent exposure.) Just by way of factual information: There's no Java in the iPhone version of Safari. There are no browser plugins. There is no facility for systemwide codec plugins. There is no way to get an App Store app to launch automatically from Web content. I don't think there is any obstacle to posting an App Store app that does nothing but play videos from WikiPedia, the way the YouTube app plays YouTube videos. But I don't think there is a way to integrate it with browsing. Regards, Maciej ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Proposal: switch to HTML 5
We need to inform people that the quality of experience can be substantially improved if they use a browser that supports free formats. Wikimedia only distributes content in free formats because if you have to pay for a license to view, edit or publish ~free content~ then the content is not really ~free~. We have requested that Apple and IE support free formats, but they have chosen not to. Therefore we are in a position where we have to recommend a browser that does provide a high-quality user experience with these formats. We are still making every effort to display the formats in IE / Safari using java or plugins, but we should inform people that they can have an improved experience, on par with proprietary solutions, if they use a different browser. --michael

Steve Bennett wrote: On Wed, Jul 8, 2009 at 4:43 PM, Marco Schusterma...@harddisk.is-a-geek.org wrote: We should not recommend Chrome - as good as it is, but it has serious privacy problems. Out of curiosity, why do we need to recommend a browser at all, and why do we think anyone will listen to our recommendation? People use the browser they use. If the site they want to go to doesn't work in their browser, they'll either not go there, or possibly try another one. They're certainly not going to change browsers just because the site told them to. Personally, I use Chrome, FF and IE. And the main reason for switching is just to have different sets of cookies. Occasionally a site doesn't like Chrome, so I switch. But it's not like I'm going to take a "your experience would be better in browser" statement seriously. Steve ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Proposal: switch to HTML 5
I think if the playback system is java in ~any browser~ we should ~softly~ inform people to get a browser with native support if they want a high quality video playback experience. The cortado applet is awesome ... but startup time of the java vm is painful compared to other user experiences with video.. not to mention seeking, buffering, and general interface responsiveness in comparison to the native support. --michael Gregory Maxwell wrote: On Tue, Jul 7, 2009 at 4:23 PM, Brion Vibberbr...@wikimedia.org wrote: Unless they don't have Ogg support. :) *cough Safari cough* But if they do, yes; our JS won't bother bringing up the Java applet if it's got native support available. It would be a four or five line patch to make OggHandler nag Safari 3/4 users to install XiphQT and give them the link to a download page. The spot for the nag is already stubbed out in the code. Just say the word. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Proposal: switch to HTML 5
Also should be noted a simple patch for oggHandler to output video and use the mv_embed library is in the works see: https://bugzilla.wikimedia.org/show_bug.cgi?id=18869 you can see it in action a few places like http://metavid.org/wiki/File:FolgersCoffe_512kb.1496.ogv Also note my ~soft~ push for native support if you don't already native support. (per our short discussion earlier in this thread) if you say don't show again it sets a cookie and won't show it again. I would be happy to randomly link to other browsers that support html5 video tag with ogg as they ship with that functionality. I don't really have apple machine handy to test quality of user experience in OSX safari with xiph-qt. But if that is on-par with Firefox native support we should probably link to the component install instructions for safari users. --michael Gregory Maxwell wrote: On Tue, Jul 7, 2009 at 1:54 AM, Aryeh Gregorsimetrical+wikil...@gmail.com wrote: [snip] * We could support video/audio on conformant user agents without the use of JavaScript. There's no reason we should need JS for Firefox 3.5, Chrome 3, etc. Of course, that could be done without switching the rest of the site to HTML5... Although I'm not sure that giving the actual video tags is desirable. It's a tradeoff: Work for those users when JS is enabled and correctly handle saving the full page including the videos vs take more traffic from clients doing range requests to generate the poster image, and potentially traffic from clients which decide to go ahead and fetch the whole video regardless of the user asking for it. There is also still a bug in FF3.5 that where the built-in video controls do not work when JS is fully disabled. (Because the controls are written in JS themselves) (To be clear to other people reading this the mediawiki ogghandler extension already uses HTML5 and works fine with Firefox 3.5, etc. But this only works if you have javascript enabled. The site could instead embed the video elements directly, and only use JS to substitute the video tag for fallbacks when it detects that the video tag can't be used) ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Minify
I would quickly add that the script-loader / new-upload branch also supports minify along with associating unique id's grouping gziping. So all your mediaWiki page includes are tied to their version numbers and can be cached forever without 304 requests by the client or _shift_ reload to get new js. Plus it works with all the static file based js includes as well. If a given set of files is constantly requested we can group them to avoid server round trips. And finally it lets us localize msg and package that in the JS (again avoiding separate trips for javascript interface msgs) for more info see the ~slightly outdated~ document: http://www.mediawiki.org/wiki/Extension:ScriptLoader peace, michael Robert Rohde wrote: I'm going to mention this here, because it might be of interest on the Wikimedia cluster (or it might not). Last night I deposited Extension:Minify which is essentially a lightweight wrapper for the YUI CSS compressor and JSMin JavaScript compressor. If installed it automatically captures all content exported through action=raw and precompresses it by removing comments, formatting, and other human readable elements. All of the helpful elements still remain on the Mediawiki: pages, but they just don't get sent to users. Currently each page served to anons references 6 CSS/JS pages dynamically prepared by Mediawiki, of which 4 would be needed in the most common situation of viewing content online (i.e. assuming media=print and media=handheld are not downloaded in the typical case). These 4 pages, Mediawiki:Common.css, Mediawiki:Monobook.css, gen=css, and gen=js comprise about 60 kB on the English Wikipedia. (I'm using enwiki as a benchmark, but Commons and dewiki also have similar numbers to those discussed below.) After gzip compression, which I assume is available on most HTTP transactions these days, they total 17039 bytes. The comparable numbers if Minify is applied are 35 kB raw and 9980 after gzip, for a savings of 7 kB or about 40% of the total file size. Now in practical terms 7 kB could shave ~1.5s off a 36 kbps dialup connection. Or given Erik Zachte's observation that action=raw is called 500 million times per day, and assuming up to 7 kB / 4 savings per call, could shave up to 900 GB off of Wikimedia's daily traffic. (In practice, it would probably be somewhat less. 900 GB seems to be slightly under 2% of Wikimedia's total daily traffic if I am reading the charts correctly.) Anyway, that's the use case (such as it is): slightly faster initial downloads and a small but probably measurable impact on total bandwidth. The trade-off of course being that users receive CSS and JS pages from action=raw that are largely unreadable. The extension exists if Wikimedia is interested, though to be honest I primarily created it for use with my own more tightly bandwidth constrained sites. -Robert Rohde ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
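Robert's bandwidth estimate is easy to sanity-check; spelling the arithmetic out (the divide-by-4 reflects that the ~7 kB saving is spread across the 4 CSS/JS requests a typical page view makes):

// Sanity check of the Extension:Minify numbers quoted above.
$gzippedNow      = 17039;     // bytes: the 4 CSS/JS pages gzipped today
$gzippedMinified = 9980;      // bytes: same pages minified, then gzipped
$savedPerView    = $gzippedNow - $gzippedMinified;       // 7059 bytes, ~41%
$rawCallsPerDay  = 500000000; // Erik Zachte's action=raw count
$savedPerDay     = $rawCallsPerDay * $savedPerView / 4;  // ~882 GB/day
echo round( $savedPerDay / 1e9 ) . " GB/day\n";          // same ballpark as the "up to 900 GB" figure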
Re: [Wikitech-l] Minify
correct me if I am wrong but thats how we presently update js and css.. we have $wgStyleVersion and when that gets updated we send out fresh pages with html pointing to js with $wgStyleVersion append. The difference in the context of the script-loader is we would read the version from the mediaWiki js pages that are being included and the $wgStyleVersion var. (avoiding the need to shift reload) ... in the context of rendering a normal page with dozens of template lookups I don't see this a particularly costly. Its a few extra getLatestRevID title calls. Likewise we should do this for images so we can send the cache forever header (bug 17577) avoiding a bunch of 304 requests. One part I am not completely clear on is how we avoid lots of simultaneous requests to the scriptLoader when it first generates the JavaScript to be cached on the squids, but other stuff must be throttled too no? Like when we update any code, language msgs, or local-settings does that does not result in the immediate purging all of wikipedia. --michael Gregory Maxwell wrote: On Fri, Jun 26, 2009 at 4:33 PM, Michael Dalemd...@wikimedia.org wrote: I would quickly add that the script-loader / new-upload branch also supports minify along with associating unique id's grouping gziping. So all your mediaWiki page includes are tied to their version numbers and can be cached forever without 304 requests by the client or _shift_ reload to get new js. Hm. Unique ids? Does this mean the every page on the site must be purged from the caches to cause all requests to see a new version number? Is there also some pending squid patch to let it jam in a new ID number on the fly for every request? Or have I misunderstood what this does? ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
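A minimal sketch of the "read the version from the MediaWiki: js pages" idea: build the grouped script request URL from the latest revision IDs of the on-wiki scripts plus $wgStyleVersion, so the response can carry a cache-forever header and edits still show up without a shift-reload. Title::getLatestRevID() is the existing core method; the entry-point name and URL format here are illustrative only.

// Illustrative only: version a grouped script-loader request by the latest
// revision IDs of the wiki JS pages it includes, plus $wgStyleVersion.
function slGetScriptRequestUrl( array $jsPageTitles ) {
	global $wgScriptPath, $wgStyleVersion;
	$parts = array();
	foreach ( $jsPageTitles as $titleText ) {
		$title = Title::newFromText( $titleText );
		// The latest revision ID changes whenever the on-wiki script is edited,
		// so squid/browser caches can hold the old URL forever.
		$parts[] = $titleText . '|' . ( $title ? $title->getLatestRevID() : 0 );
	}
	return $wgScriptPath . '/mwScriptLoader.php?class=' .
		urlencode( implode( ',', $parts ) ) . '&urid=' . $wgStyleVersion;
}

// e.g. slGetScriptRequestUrl( array( 'MediaWiki:Common.js', 'MediaWiki:Monobook.js' ) );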
Re: [Wikitech-l] Minify
Aryeh Gregor wrote: Any given image is not included on every single page on the wiki. Purging a few thousand pages from Squid on an image reupload (should be rare for such a heavily-used image) is okay. Purging every single page on the wiki is not.

yeah .. we are just talking about adding image.jpg?image_revision_id to all the image srcs at page render time; that should never purge everything on the wiki ;)

No. We don't purge Squid on these events, we just let people see old copies. Of course, this doesn't normally apply to registered users (who usually [always?] get Squid misses), or to pages that aren't cached (edit, history, . . .).

okay, that's basically what I understood. That makes sense.. although it would be nice to think about a job or process that purges pages with outdated language msgs, or pages that are referencing outdated scripts, style-sheets, or image urls. We ~do~ add jobs to purge for template updates. Are other things, like language msg or code updates, candidates for job purge tasks? ... I guess it's not too big a deal to get an old page until someone updates it. --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Foundation-l] Why don't we re-encode proprietary formats as Ogg?
I am definitely not opposed to adding that functionality, as I have mentioned in the past: see thread: http://www.mail-archive.com/wikitech-l@lists.wikimedia.org/msg00888.html You should take a look at the work Mike Baynton did back in summer of code 07. The issue that we have is both of the bottlenecks you mentioned. Where possible we want to crowd-source the transcoding costs, and we have working firefogg support which we can push out more aggressively once firefox 3.5 lands. Essentially wikimedia commons is not designed to support hosting the raw footage. Other like-minded organizations like archive.org, which have peta-bytes of storage across thousands of storage nodes, are better positioned to act as a host for raw footage and its derivatives. Additionally, commons is a strict archive where files not licensed properly often get removed, whereas archive.org can act as a more permanent storage space while license issues are sorted out. Since wikimedia projects will shortly support inline searching and time-segment grabbing of archive.org material, it's maybe not so critical that we create and host the transcoding infrastructure ourselves. Although, as you mention, it would be nice to support transcoding on wikimedia's servers for the "uploading short clips from a cell phone" type cases. --michael

David Gerard wrote: [cc'd back to wikitech-l] 2009/6/8 Tim Starling tstarl...@wikimedia.org: It's been discussed since OggHandler was invented in 2007, and I've always been in favour of it. But the code hasn't materialised, despite a Google Summer of Code project come and gone that was meant to implement a transcoding queue. The transcoding queue project was meant to allow transformations in quality and size, but it would also allow format changes without much trouble. Ahhh, that's fantastic, so it is just a Simple Matter of Programming :-D (I'm tempted to bodge something together myself, despite my low opinion of my own coding abilities ;-) ) Start simple. Upload your phone and camera video files! We'll transcode them into Theora and store them. Pick suitable (tweakable) defaults. Get it doing that one job. Then we can think about size/quality transformations later. Sound like a vague plan? Bottlenecks: 1. CPU to transcode with. 2. Disk space for queued video. - d. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] firefogg local encode new-upload branch update.
As you may know I have been working on firefogg integration with mediaWiki. As you may also know the mwEmbed library is being designed to support embedding of these interfaces in arbitrary external contexts. I wanted to quickly highlight a useful stand alone usage example of the library: http://www.firefogg.org/make/advanced.html This Make Ogg link will be something you can send to a person so they can encode source footage to a local ogg video file with the latest and greatest ogg encoders (presently the thusnelda theora encoder vorbis audio). Updates to thusnelda and other free codecs will be pushed out via firefogg updates. For commons / wikimedia usage we will directly integrate firefogg (using that same codebase) You can see an example of how that works on the 'new-upload' branch here: http://sandbox.kaltura.com/testwiki/index.php/Special:Upload ... hopefully we will start putting some of this on testing.wikipedia.org ~soonish ?~ The new-upload branch feature set is quite extensive including the script-loader, jquery javascript refactoring, the new upload-api, new mv_embed video player, add media wizard etc. Any feedback and specific bug reports people can do will be super helpful in gearing up for merging this 'new-upload' branch. For an overview see: http://www.mediawiki.org/wiki/Media_Projects_Overview peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] more bugzilla components
I would like to request categorization for the media projects to the bug tracker. To get a brief idea of the components getting packaged into the new-upload branch check out: http://www.mediawiki.org/wiki/Media_Projects_Overview I think the large scope of code and the fact that MwEmbed can be used in self-contained mode warrants a high level Product categorization. Something like MwEmbed : The self contained jQuery based javascript library for embedding mediaWiki interfaces: Then components for the library could (presently) include the following: * Add Media Wizard * Firefogg * Clip Edit * Embed Video * Sequence Editor * Timed Text * example usage * js Script-Loader * Themes and Styles * *I also want to report some strangeness with bugzilla. I sometimes get the below error when trying to log in (without restrict to ip checked ) and I occasionally get time-outs when submitting bugs: Undef to trick_taint at Bugzilla/Util.pm line 67 Bugzilla::Util::trick_taint('undef') called at Bugzilla/Auth/Persist/Cookie.pm line 61 Bugzilla::Auth::Persist::Cookie::persist_login('Bugzilla::Auth::Persist::Cookie=ARRAY(0xXX)', 'Bugzilla::User=HASH(0xXX)') called at Bugzilla/Auth.pm line 147 Bugzilla::Auth::_handle_login_result('Bugzilla::Auth=ARRAY(0xXX)', 'HASH(0xX)', 2) called at Bugzilla/Auth.pm line 92 Bugzilla::Auth::login('Bugzilla::Auth=ARRAY(0xX)', 2) called at Bugzilla.pm line 232 Bugzilla::login('Bugzilla', 0) called at /srv/org/wikimedia/bugzilla/relogin.cgi line 192 peace, michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] request for feedback on new-upload branch
The new-upload branch includes a good set of new features and is available here: http://svn.wikimedia.org/svnroot/mediawiki/branches/new-upload/phase3/

Major Additions:
* action=upload added to the api
* Supports new upload interfaces (dependent on mv_embed / jQuery libs)
** supports upload over http with progress reporting to the user.
** support for chunked upload with progress indicators and client-side transcoding for videos (chunks for other large file types almost there) (dependent on the firefogg extension)
** supports upload-api error msg lookup and reporting back.

To test the new upload /interfaces/ you need the latest copy of mv_embed / add_media_wizard.js. You can get it by checking out: http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/MetavidWiki/skins/ and then adding something like the following to localSettings.php:

$wgExtensionFunctions[] = 'addMediaWizard';
function addMediaWizard() {
	global $wgOut, $wgJsMimeType;
	$utime = time();
	$wgOut->addScript(
		"<script type=\"{$wgJsMimeType}\" src=\"http://localhost/{path_to_skins_dir}/add_media_wizard.js?urid={$utime}\"></script>"
	);
}

Comments or references to new bugs can be reported to bug 18563, which is tracking its inclusion.

== Things already on the TODO list ==
* Deprecate upload.js (destination checks, filename checks etc) in favor of the jQuery / mv_embed style upload interface. (This is dependent on getting the script-loader branch into the trunk, bug 18464, along with the base mv_embed / jQuery libs.) Will have a separate email with a call for feedback on that branch shortly, once I finish up the css style-sheet grouping and mv_embed lib merging.

--mv_embed upload interfaces--
* Support pass-through mode for firefogg (pass-through mode is only in development branches of firefogg ... should be released soon)
* Support remote iframe driver (for uploading to commons with progress indicators while editing an article on another site (like wikipedia)) (will be an upload tab in the add_media_wizard)

peace, michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Google Summer of Code: accepted projects
Roan Kattouw wrote: The problem here seems to be that thumbnail generation times vary a lot, based on format and size of the original image. It could be 10 ms for one image and 10 s for another, who knows.

yeah, again: if we only issue the big resize operation on initial upload, with a memory-friendly in-place library like vips, I think we will be okay. Since the user just waited like 10-15 minutes to upload their huge image, waiting an additional 10-30s at that point for the thumbnail and the instant gratification of seeing your image on the upload page ... is not such a big deal. Then in-page-use derivatives could predictably resize the 1024x768 ~or so~ image in realtime; again, instant gratification on page preview or page save. Operationally this could go out to a thumbnail server or be done on the apaches; if they are small operations it may be easier to keep the existing infrastructure than to intelligently handle the edge cases outlined (many resize requests at once, placeholders, image proxy / daemon setup).

AFAICT this isn't about optimization, it's about not bogging down the Apache that has the misfortune of getting the first request to thumb a huge image (but having a dedicated server for that instead), and about not letting the associated user wait for ages. Even worse, requests that thumb very large images could hit the 30s execution limit and fail, which means those thumbs will never be generated but every user requesting them will have a request last for 30s and time out.

Again, this may be related to the unpredictable memory usage of ImageMagick when resizing large images, rather than using a fast, memory-confined resize engine, no? ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
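A sketch of the resize-once-at-upload idea; wfShellExec() is the existing MediaWiki shell wrapper, but the vipsthumbnail command line and the $wgVipsThumbnailCommand setting are assumptions that would need checking against whatever vips tooling actually gets deployed.

// Sketch: produce a ~1024px-wide working copy at upload time with vips,
// and fall back to doing nothing (rather than crashing the apache node)
// if the tool is missing. The vipsthumbnail invocation is illustrative only.
function makeWorkingCopy( $srcPath, $dstPath, $maxWidth = 1024 ) {
	global $wgVipsThumbnailCommand;   // hypothetical setting, e.g. '/usr/bin/vipsthumbnail'
	if ( !$wgVipsThumbnailCommand || !is_executable( $wgVipsThumbnailCommand ) ) {
		return false;   // no in-place resizer available; skip the derivative
	}
	$cmd = wfEscapeShellArg( $wgVipsThumbnailCommand ) .
		' -s ' . intval( $maxWidth ) .
		' ' . wfEscapeShellArg( $srcPath ) .
		' -o ' . wfEscapeShellArg( $dstPath );
	$retval = 0;
	wfShellExec( $cmd, $retval );
	return $retval === 0 && file_exists( $dstPath );
}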
Re: [Wikitech-l] Google Summer of Code: accepted projects
Aryeh Gregor wrote: I'm not clear on why we don't just make the daemon synchronously return a result the way ImageMagick effectively does. Given the level of reuse of thumbnails, it seems unlikely that the latency is a significant concern -- virtually no requests will ever actually wait on it.

(I basically outlined these issues on the soc page but here they are again with a bit more clarity.)

I recommended that the image daemon run semi-synchronously, since the changes needed to maintain multiple states and return non-cached placeholder images, while managing updates and page purges for when the updated images become available within the wikimedia server architecture, probably won't be completed in the summer of code time-line. But if the student is up for it the concept would be useful for other components like video transformation / transcoding, sequence flattening etc. It's just not what I would recommend for the summer of code time-line.

== Per issues outlined in bug 4854 ==
I don't think it's a good idea to invest a lot of energy into a separate python-based image daemon. It won't avoid all the problems listed in bug 4854. Shell-character-exploit issues should be checked against anyway (since not everyone is going to install the daemon). Other people using mediaWiki won't want to add a python- or java-based image resizer and resolve its python or java component library dependencies. It won't be easier to install than imagemagick or php-gd, which are repository-hosted applications and already present in shared hosting environments. Once you start integrating other libs like (java) Batik it becomes difficult to resolve dependencies (java, python etc) and to install: you have to push out a new program that is not integrated into the application repository managers of the various distributions. The potential to isolate CPU and memory usage should be considered in the core mediaWiki image resize support anyway, i.e. we don't want to crash other people's servers who are using mediaWiki by not checking the upper bounds of image transforms. Instead we should make the core image transform smarter: maybe have a configuration var that /attempts/ to bound the upper memory for spawned processes and take that into account before issuing the shell command for a given large image transformation with a given shell application.

== What the image resize efforts should probably focus on ==
(1) making the existing system more robust and (2) better taking advantage of multi-threaded servers.

(1) Right now the system chokes on large images; we should deploy support for an in-place image resize, maybe something like vips (?) (http://www.vips.ecs.soton.ac.uk/index.php?title=Speed_and_Memory_Use). The system should intelligently call vips to transform the image to a reasonable size at time of upload, then use those derivatives for just-in-time thumbs for articles. (If vips is unavailable we don't transform and we don't crash the apache node.)

(2) Maybe spin out the image transform process early on in the parsing of the page, with a placeholder and a callback, so that by the time all the templates and links have been looked up the image is ready for output. (Maybe another function wfShellBackgroundExec($cmd, $callback_function), maybe using pcntl_fork, then a normal wfShellExec, then pcntl_waitpid, then the callback function ... which sets some var in the parent process so that pageOutput knows it's good to go.)

If operationally the daemon should be on a separate server we should still more or less run synchronously ... as mentioned above ... if possible the daemon should be php-based so we don't explode the dependencies for deploying robust image handling with mediaWiki. peace, --michael ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
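For the record, a sketch of what the proposed wfShellBackgroundExec() / callback pair could look like with the pcntl functions mentioned above; this is just the shape of the idea (forking inside a mod_php apache worker has its own problems, which is part of why a separate daemon keeps coming up):

// Sketch of the proposed wfShellBackgroundExec(): fork, run the shell
// command in the child, and let the parent collect the result later and
// fire a callback once the transform is done.
function wfShellBackgroundExec( $cmd ) {
	$pid = pcntl_fork();
	if ( $pid === -1 ) {
		// Fork failed: fall back to the normal blocking behaviour.
		wfShellExec( $cmd );
		return false;
	}
	if ( $pid === 0 ) {
		wfShellExec( $cmd );   // child: do the image transform
		exit( 0 );
	}
	return $pid;               // parent: keep parsing the page
}

function wfShellBackgroundWait( $pid, $callback ) {
	if ( $pid === false ) {
		call_user_func( $callback );   // work was already done synchronously
		return;
	}
	$status = 0;
	pcntl_waitpid( $pid, $status );    // block only now, at output time
	call_user_func( $callback );       // e.g. swap the placeholder for the thumb
}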
Re: [Wikitech-l] Skin JS cleanup and jQuery
hmm right... The idea of the scriptLoader is we get all our #1 included javascript in a single request. So we don't have round trips that would benefit as much from lazy loading so no need to rewrite stuff that is included that way already. I don't think we are proposing convert all scripts to #2 or #3 loading... We already have the importScriptURI function which script use for loading when not using #1. I do suggest we move away from importScriptURI to something like the doLoad function in mv_embed ... that way we can load multiple js files in a single request using the mediaWiki scriptServer (if its enabled). Right now all the importScriptURI stuff works non-blocking and included scripts need to include code to execute anything they want to run. To make things more maintainable and modular we should transition to objects/classes providing methods which can be extended and autoloaded rather than lots of single files doing lots of actions on the page in a less structured fashion. But there is no rush to transition as the scripts are working as is and the new infrastructure will work with the scripts as they are. But the idea of the new infrastructure is to support that functionality in the future... --michael Sergey Chernyshev wrote: No, my link is about 3 ways of loading: 1. Normal script tags (current style) 2. Asynchronous Script Loading (loading scripts without blocking, but without waiting for onload) 3. Lazyloading (loading script onload). Number 2 might be usable as well. In any case changing all MW and Extensions code to work for #2 or #3 might be a hard thing. Thank you, Sergey -- Sergey Chernyshev http://www.sergeychernyshev.com/ On Wed, Apr 22, 2009 at 1:21 PM, Michael Dale md...@wikimedia.org wrote: The mv_embed.js includes a doLoad function that matches the autoLoadJS classes listed in mediaWiki php. So you can dynamically autoload arbitrary sets of classes (js-files in the mediaWiki software) in a single http request and then run something once they are loaded. It can also autoload sets of wiki-titles for user-space scripts again in a single request grouping, localizing, gziping and caching all the requested wiki-title js in a single request. This is nifty cuz say your script has localized msg. You can fill these in in user-space MediaWiki:myMsg then put them in the header of your user-script, then have localized msg in user-space javascript ;) .. When I get a chance I will better document this ;) But its basically outlined here: http://www.mediawiki.org/wiki/Extension:ScriptLoader The link you highlight appears to be about running stuff once the page is ready. jQuery includes a function $(document).ready(function(){ //code to run now that the dom-state is ready }) so your enabled gadget could use that to make sure the dom is ready before executing some functions. (Depending on the type of js functionality your adding it /may/ be better to load on-demand once a new interface component is invoked rather than front load everything. Looking at the add-media-wizard gadget on testing.wikipedia.org for an idea of how this works. peace, --michael Sergey Chernyshev wrote: Yep, with jQuery in the core it's probably best to just bundle it. There is another issue with the code loading and stuff - making JS libraries call a callback function when they load and all the functionality to be there instead of relying on browser to block everything until library is loaded. 
This is quite advance thing considering that all the code will have to be converted to this model, but it will allow for much better performance when implemented. Still it's probably Phase 5 kind of optimization, but it can bring really good results considering JS being the biggest blocker. More on the topic is on Steve Souders' blog: http://www.stevesouders.com/blog/2008/12/27/coupling-async-scripts/ Thank you, Sergey -- Sergey Chernyshev http://www.sergeychernyshev.com/ On Wed, Apr 22, 2009 at 12:42 PM, Brion Vibber br...@wikimedia.org wrote: On 4/22/09 9:33 AM, Sergey Chernyshev wrote: Exactly because this is the kind of requests we're going to get, I think it makes sense not to have any library bundled by default, but have a centralized handling for libraries, e.g. one extension asks for latest jQuery and latest YUI and MW loads them, another extension asks for jQuery only and so on. Considering we want core code to be able to use jQuery, I think the case for bundling it is pretty strong. :) -- brion ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l