Re: [Wikitech-l] Upload file size limit

2010-07-22 Thread Victor Vasiliev
On Tue, Jul 20, 2010 at 7:19 PM, Max Semenik <maxsem.w...@gmail.com> wrote:
 There's also Flash that can do it, however it's being ignored due
 to its proprietary nature.

May we drop our ideological concerns and implement multiple ways of
uploading, including Flash and Java applets?

--vvv



Re: [Wikitech-l] Upload file size limit

2010-07-21 Thread Platonides
Tim Starling wrote:
 The problem is just that increasing the limits in our main Squid and
 Apache pool would create DoS vulnerabilities, including the prospect
 of accidental DoS. We could offer this service via another domain
 name, with a specially-configured webserver, and a higher level of
 access control compared to ordinary upload to avoid DoS, but there is
 no support for that in MediaWiki.
 
 We could theoretically allow uploads of several gigabytes this way,
 which is about as large as we want files to be anyway. People with
 flaky internet connections would hit the problem of the lack of
 resuming, but it would work for some.
 
 -- Tim Starling

I don't think it would be a problem for MediaWiki if we wanted to go
this route. There could be e.g. http://upload.en.wikipedia.org/ which
redirected all wiki pages except Special:Upload to http://en.wikipedia.org/

The normal Special:Upload would need a redirect there, for accesses
not going via $wgUploadNavigationUrl, but that's a couple of lines.

Having the normal Apaches handle uploads instead of a dedicated pool has
some issues, including the DoS you mention, filled /tmp partitions, and
needing write access to storage via NFS...




Re: [Wikitech-l] Upload file size limit

2010-07-21 Thread Aryeh Gregor
On Wed, Jul 21, 2010 at 12:31 AM, Neil Kandalgaonkar
<ne...@wikimedia.org> wrote:
 Here's a demo which implements an EXIF reader for JPEGs in Javascript,
 which reads the file as a stream of bytes.

   http://demos.hacks.mozilla.org/openweb/FileAPI/

 So, as you can see, we do have a form of BLOB access.

But only by reading the whole file into memory, right?  That doesn't
adequately address the use-case we're discussing in this thread
(uploading files > 100 MB in chunks).


Re: [Wikitech-l] Upload file size limit

2010-07-21 Thread Mark A. Hershberger
Michael Dale <md...@wikimedia.org> writes:

 * Modern HTML5 browsers are starting to be able to natively split files
 up into chunks and do separate 1 MB XHR posts. The Firefogg extension does
 something similar with extension JavaScript.

Could you point me to the specs that the HTML5 browsers are using?
Would it be possible to just make Firefogg mimic this same protocol for
pre-HTML5 Firefox?

 * We should really get the chunk uploading reviewed and deployed. Tim
 expressed some concerns with the chunk uploading protocol, which we
 addressed client side, but I don't think he had time to follow up with
 the proposed changes that we made for the server API.

If you can point me to Tim's proposed server-side changes, I'll have a
look.

Mark.

-- 
http://hexmode.com/

Embrace Ignorance.  Just don't get too attached.



Re: [Wikitech-l] Upload file size limit

2010-07-21 Thread Aryeh Gregor
On Wed, Jul 21, 2010 at 11:19 AM, Mark A. Hershberger <m...@everybody.org>
wrote:
 Could you point me to the specs that the html5 browsers are using?
 Would it be possible to just make Firefogg mimic this same protocol for
 pre-html5 Firefox?

The relevant spec is here:

http://www.w3.org/TR/FileAPI/

Firefox 3.6 doesn't implement it exactly, since it was changed after
Firefox's implementation, but the changes should mostly be compatible
(as I understand it).  But it's not good enough for large files, since
it has to read them into memory.
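
For illustration, here's a minimal sketch of reading a file with the
Firefox 3.6 FileReader (the input element id is hypothetical); the point
is that the entire file ends up in memory at once:

  var input = document.getElementById('file-input'); // hypothetical <input type="file">
  var file = input.files[0];
  var reader = new FileReader();
  reader.onload = function () {
    // reader.result now holds the *entire* file contents in memory,
    // which is what makes this unsuitable for very large uploads
    alert('read ' + reader.result.length + ' bytes');
  };
  reader.readAsBinaryString(file);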

But anyway, what's the point in telling people to install an extension
if we can just tell them to upgrade Firefox?  Something like
two-thirds of our Firefox users are already on 3.6:

http://stats.wikimedia.org/wikimedia/squids/SquidReportClients.htm



Re: [Wikitech-l] Upload file size limit

2010-07-21 Thread Michael Dale
On 07/20/2010 10:24 PM, Tim Starling wrote:
 The problem is just that increasing the limits in our main Squid and
 Apache pool would create DoS vulnerabilities, including the prospect
 of accidental DoS. We could offer this service via another domain
 name, with a specially-configured webserver, and a higher level of
 access control compared to ordinary upload to avoid DoS, but there is
 no support for that in MediaWiki.

 We could theoretically allow uploads of several gigabytes this way,
 which is about as large as we want files to be anyway. People with
 flaky internet connections would hit the problem of the lack of
 resuming, but it would work for some.

Yes, in theory we could do that... or we could support some simple chunk
uploading protocol for which there is *already* basic support written,
and which will be supported in native JS over time.

The Firefogg protocol is almost identical to the plupload protocol. The
main difference is that Firefogg requests a unique upload parameter / URL
back from the server, so that identically named files would not mangle
each other's chunking. From a quick look at plupload's upload.php, it
appears plupload relies on the filename plus an extra chunk request
parameter != 0. The other difference is that Firefogg sends an explicit
done = 1 request parameter to signify the end of the chunks.

We requested feedback on adding a chunk id to the Firefogg chunk
protocol with each posted chunk, to guard against cases where the outer
caches report an error but the backend got the file anyway. This way the
backend can check the chunk index and not append the same chunk twice,
even if there are errors at other levels of the server response that
cause the client to resend the same chunk.
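
To make that concrete, here is a rough sketch of such a chunk POST;
apart from done = 1, the parameter names (chunk, the per-upload URL) are
illustrative assumptions, not the actual Firefogg wire protocol:

  // uploadUrl: the unique per-upload URL handed back by the server.
  // chunkIndex: the proposed explicit index, so the backend can detect
  // and skip a duplicate append when a chunk is resent after a
  // cache-level error.
  function postChunk(uploadUrl, chunk, chunkIndex, isLast, onDone) {
    var xhr = new XMLHttpRequest();
    var url = uploadUrl + '?chunk=' + chunkIndex + (isLast ? '&done=1' : '');
    xhr.open('POST', url);
    xhr.onreadystatechange = function () {
      if (xhr.readyState === 4) {
        onDone(xhr.status === 200);
      }
    };
    xhr.send(chunk); // chunk: a File/Blob, or a binary string via sendAsBinary()
  }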

Either way, if Tim says that the plupload chunk protocol is superior,
then why discuss it? We can easily shift the chunks API to that and *move
forward* with supporting larger file uploads. Is that at all agreeable?

peace,
--michael



Re: [Wikitech-l] Upload file size limit

2010-07-21 Thread Aryeh Gregor
On Wed, Jul 21, 2010 at 2:05 PM, Aryeh Gregor
<simetrical+wikil...@gmail.com> wrote:
 This is the right place to bring it up:

 http://lists.w3.org/Archives/Public/public-webapps/

 I think the right API change would be to just allow slicing a Blob up
 into other Blobs by byte range.  It should be simple to both spec and
 implement.  But it might have been discussed before, so best to look
 in the archives first.

Aha, I finally found it.  It's in the spec already:

http://dev.w3.org/2006/webapi/FileAPI/#dfn-slice

So once you have a File object, you should be able to call
file.slice(pos, 1024*1024) to get a Blob object that's 1024*1024 bytes
long starting at pos.  Of course, this surely won't be reliably
available in all browsers for several years yet, so best not to pin
our hopes on it.  Chrome apparently implements some or all of the File
API in version 6, but I can't figure out if it includes this part.
Firefox doesn't yet according to MDC.
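
As a minimal sketch, assuming the slice(start, length) signature in the
current draft (later drafts or browsers may change this to (start, end),
so real code would have to feature-test), carving a File into 1 MB Blobs
would look something like:

  var CHUNK_SIZE = 1024 * 1024;
  function sliceFile(file) {
    var chunks = [];
    for (var pos = 0; pos < file.size; pos += CHUNK_SIZE) {
      // each chunk is a Blob of up to CHUNK_SIZE bytes starting at pos
      chunks.push(file.slice(pos, Math.min(CHUNK_SIZE, file.size - pos)));
    }
    return chunks;
  }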


[Wikitech-l] Upload file size limit

2010-07-20 Thread Lars Aronsson
Time and again, the 100 MB limit on file uploads is a problem,
in particular for multipage documents (scanned books) in PDF
or Djvu, and for video files in OGV.

What are the plans for increasing this limit? Would it be
possible to allow 500 MB or 1 GB for these file formats,
and maintain the lower limit for other formats?


-- 
   Lars Aronsson (l...@aronsson.se)
   Aronsson Datateknik - http://aronsson.se





Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Roan Kattouw
2010/7/20 Lars Aronsson <l...@aronsson.se>:
 Time and again, the 100 MB limit on file uploads is a problem,
 in particular for multipage documents (scanned books) in PDF
 or Djvu, and for video files in OGV.

 What are the plans for increasing this limit? Would it be
 possible to allow 500 MB or 1 GB for these file formats,
 and maintain the lower limit for other formats?

There is support for chunked uploading in MediaWiki core, but it's
disabled for security reasons AFAIK. With chunked uploading, you're
uploading your file in chunks of 1 MB, which means that the impact of
failure for large uploads is vastly reduced (if a chunk fails, you
just reupload that chunk) and that progress bars can be implemented.
This does need client-side support, e.g. using the Firefogg extension
for Firefox or a bot framework that knows about chunked uploads. This
probably means the upload limit can be raised, but don't quote me on
that.
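
For illustration, a rough sketch of the client-side retry logic this
enables (the URL and chunk parameter here are hypothetical, not
MediaWiki's actual chunked-upload API):

  // chunks: an array of ~1 MB Blobs (or binary strings) to send in order.
  function uploadChunks(chunks, url, onProgress) {
    var i = 0;
    function sendNext() {
      if (i >= chunks.length) return; // all chunks accepted
      var xhr = new XMLHttpRequest();
      xhr.open('POST', url + '?chunk=' + i);
      xhr.onreadystatechange = function () {
        if (xhr.readyState !== 4) return;
        if (xhr.status === 200) {
          i++;                           // chunk accepted, move on
          onProgress(i / chunks.length); // progress bar hook
        }
        // on failure i is unchanged, so only this chunk is resent
        // (a real client would cap retries instead of looping forever)
        sendNext();
      };
      xhr.send(chunks[i]);
    }
    sendNext();
  }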

Roan Kattouw (Catrope)



Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Daniel Kinzler
Lars Aronsson schrieb:
 What are the plans for increasing this limit? Would it be
 possible to allow 500 MB or 1 GB for these file formats,
 and maintain the lower limit for other formats?

As far as I know, we are hitting the limits of HTTP here. Increasing the upload
limit as such isn't a solution, and a per-file-type setting doesn't help, since
the limit strikes before PHP is even started. It's at the server level.

The solution is chunked uploads, which people have been working on for a
while, but I have no idea what the current status is.

-- daniel



Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Huib Laurens
Hello Lars,

I don't think the problem is raising it to 200 MB or 150 MB, but 500 MB or
1 GB are a lot higher and can cause problems.

Anonymous FTP access sounds like a very, very, very bad and evil solution...



-- 
Huib Abigor Laurens

Tech team

www.wikiweet.nl - www.llamadawiki.nl - www.forgotten-beauty.com -
www.wickedway.nl - www.huiblaurens.nl - www.wikiweet.org


Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Roan Kattouw
2010/7/20 Max Semenik <maxsem.w...@gmail.com>:
 On 20.07.2010, 19:12 Lars wrote:

 Requiring special client software is a problem. Is that really
 the only possible solution?

 There's also Flash that can do it, however it's being ignored due
 to its proprietary nature.

Java applet?

Roan Kattouw (Catrope)



Re: [Wikitech-l] Upload file size limit (frontend)

2010-07-20 Thread Neil Kandalgaonkar
I hope to begin to address this problem with the new UploadWizard, at 
least the frontend issues. This isn't really part of our mandate, but I 
am hoping to add in chunked uploads for bleeding-edge browsers like 
Firefox 3.6+ and 4.0. Then you can upload files of whatever size you want.

I've written it to support what I'm calling multiple transport
mechanisms: some use simple HTTP uploads, and others more exotic methods
like Mozilla's FileAPI.
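
As an illustrative sketch (not UploadWizard's actual code), the idea is
that each transport advertises whether the current browser supports it
and exposes a single upload entry point:

  var transports = [
    {
      name: 'fileapi-chunked',
      isAvailable: function () { return typeof FileReader !== 'undefined'; },
      upload: function (file, onProgress) { /* chunked FileAPI upload */ }
    },
    {
      name: 'simple-http',
      isAvailable: function () { return true; }, // plain form POST fallback
      upload: function (file, onProgress) { /* ordinary form submission */ }
    }
  ];

  // Pick the first transport the current browser supports.
  function chooseTransport() {
    for (var i = 0; i < transports.length; i++) {
      if (transports[i].isAvailable()) return transports[i];
    }
  }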

At this point, we're not considering adding any new technologies like 
Java or Flash to the mix, although these are the standard ways that 
people do usable uploads on the web. Flash isn't considered open enough, 
and Java seemed like a radical break.

I could see a role for helper applets or SWFs, but it's not on the 
agenda at this time. Right now we're trying to deliver something that 
fits the bill, using standard MediaWiki technologies (HTML, JS, and PHP).

I'll post again to the list if I get a FileAPI upload working. Or, if 
someone is really interested, I'll help them get started.


On 7/20/10 11:28 AM, Platonides wrote:
 Roan Kattouw wrote:
 2010/7/20 Max Semenik <maxsem.w...@gmail.com>:
 On 20.07.2010, 19:12 Lars wrote:
 Requiring special client software is a problem. Is that really
 the only possible solution?

 There's also Flash that can do it, however it's being ignored due
 to its proprietary nature.

 Java applet?

 Roan Kattouw (Catrope)

 Or a modern browser using FileReader.

 http://hacks.mozilla.org/2010/06/html5-adoption-stories-box-net-and-html5-drag-and-drop/





-- 
Neil Kandalgaonkar  |) ne...@wikimedia.org



Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Aryeh Gregor
On Tue, Jul 20, 2010 at 2:28 PM, Platonides <platoni...@gmail.com> wrote:
 Or a modern browser using FileReader.

 http://hacks.mozilla.org/2010/06/html5-adoption-stories-box-net-and-html5-drag-and-drop/

This would be best, but unfortunately it's not yet usable for large
files -- it has to read the entire file into memory on the client.
This post discusses a better interface that's being deployed:

http://hacks.mozilla.org/2010/07/firefox-4-formdata-and-the-new-file-url-object/

But I don't think it actually addresses our use-case.  We'd want the
ability to slice up a File object into Blobs and handle those
separately, and I don't see it in the specs.  I'll ask.  Anyway, I
don't think this is feasible just yet, sadly.



Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Tim Starling
On 21/07/10 00:30, Roan Kattouw wrote:
 There is support for chunked uploading in MediaWiki core, but it's
 disabled for security reasons AFAIK. With chunked uploading, you're
 uploading your file in chunks of 1 MB, which means that the impact of
 failure for large uploads is vastly reduced (if a chunk fails, you
 just reupload that chunk) and that progress bars can be implemented.
 This does need client-side support, e.g. using the Firefogg extension
 for Firefox or a bot framework that knows about chunked uploads. This
 probably means the upload limit can be raised, but don't quote me on
 that.

Firefogg support has been moved out to an extension, and that
extension was not complete last time I checked. There was chunked
upload support in the API, but it was Firefogg-specific; no
client-neutral protocol has been proposed. The Firefogg chunking
protocol itself is poorly thought out and buggy; it's not the sort of
thing you'd want to use by choice with a non-Firefogg client.

Note that it's not necessary to use Firefogg to get chunked uploads;
there are lots of available technologies which users are more likely
to have installed already. See the chunking line in the support
matrix at http://www.plupload.com/

When I reviewed Firefogg, I found an extremely serious CSRF
vulnerability in it. They say they have fixed it now, but I'd still be
more comfortable promoting better-studied client-side extensions, if
we have to promote a client-side extension at all.

-- Tim Starling




Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Tim Starling
On 21/07/10 00:32, Daniel Kinzler wrote:
 Lars Aronsson schrieb:
 What are the plans for increasing this limit? Would it be
 possible to allow 500 MB or 1 GB for these file formats,
 and maintain the lower limit for other formats?
 
 As far as I know, we are hitting the limits of HTTP here. Increasing the
 upload limit as such isn't a solution, and a per-file-type setting doesn't
 help, since the limit strikes before PHP is even started. It's at the
 server level.

The problem is just that increasing the limits in our main Squid and
Apache pool would create DoS vulnerabilities, including the prospect
of accidental DoS. We could offer this service via another domain
name, with a specially-configured webserver, and a higher level of
access control compared to ordinary upload to avoid DoS, but there is
no support for that in MediaWiki.

We could theoretically allow uploads of several gigabytes this way,
which is about as large as we want files to be anyway. People with
flaky internet connections would hit the problem of the lack of
resuming, but it would work for some.

-- Tim Starling




Re: [Wikitech-l] Upload file size limit (backend)

2010-07-20 Thread Neil Kandalgaonkar
On 7/20/10 9:57 AM, Michael Dale wrote:
 * The reason for the 100 MB limit has to do with PHP and Apache and how
 it stores the uploaded POST in memory, so setting the limit higher would
 risk increasing the chances of Apaches hitting swap if multiple uploads
 happened on a given box.

I've heard others say that -- this may have been true before, but I'm
pretty sure it's not true any more in PHP 5.2 or greater.

I've been doing some tests with large uploads (around 50 MB) and I don't
observe any Apache process getting that large. Instead, PHP writes a
temporary file. I checked out the source where it handles uploads, and it
seems to take care not to slurp the whole thing into memory
(lines 1061-1106):

http://svn.php.net/viewvc/php/php-src/trunk/main/rfc1867.c?view=markup

So, there may be other reasons not to upload a very large file, but I 
don't think this is one of them.

-- 
Neil Kandalgaonkar  |) ne...@wikimedia.org



Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Neil Kandalgaonkar
On 7/20/10 6:34 PM, Aryeh Gregor wrote:
On Tue, Jul 20, 2010 at 2:28 PM, Platonides <platoni...@gmail.com> wrote:
 Or a modern browser using FileReader.

 http://hacks.mozilla.org/2010/06/html5-adoption-stories-box-net-and-html5-drag-and-drop/

 This would be best, but unfortunately it's not yet usable for large
 files -- it has to read the entire file into memory on the client.
 [...]
 But I don't think it actually addresses our use-case.  We'd want the
 ability to slice up a File object into Blobs and handle those
 separately, and I don't see it in the specs.  I'll ask.  Anyway, I
 don't think this is feasible just yet, sadly.

Here's a demo which implements an EXIF reader for JPEGs in Javascript, 
which reads the file as a stream of bytes.

   http://demos.hacks.mozilla.org/openweb/FileAPI/

So, as you can see, we do have a form of BLOB access.

So you're right that these newer Firefox File* APIs aren't what we want 
for uploading extremely large images (50MB or so). But I can easily see 
using this to slice up anything smaller for chunk-oriented APIs.
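
For smaller files, a sketch of that approach (assuming Firefox 3.6's
FileReader; the function and callback names are illustrative) would be to
read the whole file once and slice the resulting string in memory:

  function sliceInMemory(file, chunkSize, onChunks) {
    var reader = new FileReader();
    reader.onload = function () {
      var data = reader.result; // the entire file as a binary string
      var chunks = [];
      for (var pos = 0; pos < data.length; pos += chunkSize) {
        chunks.push(data.substring(pos, pos + chunkSize));
      }
      onChunks(chunks); // hand ~1 MB pieces to a chunk-oriented API
    };
    reader.readAsBinaryString(file);
  }

This is fine at a few tens of MB, but at 100 MB+ the whole file sits in
client memory, which is exactly the limitation discussed above.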

-- 
Neil Kandalgaonkar  |) ne...@wikimedia.org



Re: [Wikitech-l] Upload file size limit

2010-07-20 Thread Neil Kandalgaonkar
On 7/20/10 8:08 PM, Tim Starling wrote:

 The Firefogg chunking
 protocol itself is poorly thought-out and buggy, it's not the sort of
 thing you'd want to use by choice, with a non-Firefogg client.

What in your view would a better version look like?

The PLupload protocol seems quite similar. I might be missing some 
subtle difference.


 I'd still be
 more comfortable promoting better-studied client-side extensions, if
 we have to promote a client-side extension at all.

I don't think we should be relying on extensions per se. Firefogg does
do some neat things nothing else does, like converting video formats.
But it's never going to be installed by a large percentage of our users.

As far as making uploads generally easier, PLupload's approach is way 
more generic since it abstracts away the helper technologies. It will 
work out of the box for maybe 99% of the web and provides a path to 
eventually transitioning to pure JS solutions. It's a really interesting 
approach and the design looks very clean. I wish I'd known about it 
before I started this project.

That said, it went public in early 2010, and a quick visit to its forums 
will show that it's not yet bug-free software either.

Anyway, thanks for the URL. We've gone the free software purist route 
with our uploader, but we may yet learn something from PLuploader or 
incorporate some of what it does.

-- 
Neil Kandalgaonkar  |) ne...@wikimedia.org

