Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-30 Thread Marco Schuster
On Tue, Nov 30, 2010 at 8:48 AM, Dmitriy Sintsov ques...@rambler.ru wrote:
 * Bryan Tong Minh bryan.tongm...@gmail.com [Tue, 30 Nov 2010 08:44:43
 +0100]:
 I think that the most recent version should be sufficient. I don't
 think Java would break backwards compatibility: users wouldn't be
 happy if their old jar suddenly stops working on a new JVM.

 Why an outdated and inefficient ZIP format, after all? 7zip is
 incompatible to JVM, should it be a better choice for archive uploads?
 Or, that is too hard to parse on PHP side (I guess console exec is
 required)?
You can create a zip easily on all major OSes with drag'n'drop.
Windows supports it IIRC from Win 98 SE and up, a standard Linux by
the tools the desktop installs (for KDE, it once was Ark), and MacOS
also delivers ZIP out of the box.
For ZIP, there are even built-in PHP functions to handle it.
7zip is, though open source, requiring third-party plugins, both for
the OS and servers, and 7zip is not really widespread. RAR and ZIP are
the dominant formats in cross-platform data exchange.

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-30 Thread Dmitriy Sintsov
* Marco Schuster ma...@harddisk.is-a-geek.org [Tue, 30 Nov 2010 
11:05:09 +0100]:
 You can create a zip easily on all major OSes with drag'n'drop.
 Windows supports it IIRC from Win 98 SE and up, a standard Linux by
 the tools the desktop installs (for KDE, it once was Ark), and MacOS
 also delivers ZIP out of the box.
 For ZIP, there are even built-in PHP functions to handle it.
 7zip is, though open source, requiring third-party plugins, both for
 the OS and servers, and 7zip is not really widespread. RAR and ZIP are
 the dominant formats in cross-platform data exchange.

There is a console version, which could be executed on the server side to
list the contents of an archive or to analyze it:
http://sourceforge.net/projects/p7zip/
MediaWiki already relies on running external executables such as convert
(ImageMagick) and texvc. I should admit that using ImageMagick for image
resampling is faster, takes less RAM and gives better results than PHP's
built-in image handling modules (ImageMagick is also available as a PHP
module, but not everywhere, and it increases the footprint a little).
Dmitriy



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-30 Thread Dmitriy Sintsov
* Marco Schuster ma...@harddisk.is-a-geek.org [Tue, 30 Nov 2010 
11:05:09 +0100]:
 You can create a zip easily on all major OSes with drag'n'drop.
 Windows supports it IIRC from Win 98 SE and up, a standard Linux by
 the tools the desktop installs (for KDE, it once was Ark), and MacOS
 also delivers ZIP out of the box.
 For ZIP, there are even built-in PHP functions to handle it.
 7zip is, though open source, requiring third-party plugins, both for
 the OS and servers, and 7zip is not really widespread. RAR and ZIP are
 the dominant formats in cross-platform data exchange.

Also, I remember seeing 7z streams already implemented somewhere in 1.17
recently (probably with external piping).



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-30 Thread Roan Kattouw
2010/11/30 Dmitriy Sintsov ques...@rambler.ru:
 * Bryan Tong Minh bryan.tongm...@gmail.com [Tue, 30 Nov 2010 08:44:43
 +0100]:
 I think that the most recent version should be sufficient. I don't
 think Java would break backwards compatibility: users wouldn't be
 happy if their old jar suddenly stops working on a new JVM.

 Why an outdated and inefficient ZIP format, after all? 7zip is
 incompatible to JVM, should it be a better choice for archive uploads?
 Or, that is too hard to parse on PHP side (I guess console exec is
 required)?
We don't necessarily want ZIP uploads at Wikimedia, but it's not
unreasonable to want to upload OpenOffice documents. Since the OO
formats are ZIP-like, blocking ZIPs blocks those too.

Roan Kattouw (Catrope)



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-30 Thread K. Peachey
On Tue, Nov 30, 2010 at 9:40 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 We don't necessarily want ZIP uploads at Wikimedia, but it's not
 unreasonable to want to upload OpenOffice documents. Since the OO
 formats are ZIP-like, blocking ZIPs blocks those too.

 Roan Kattouw (Catrope)
That said, if these features get implemented in code, they would
probably be wanted beyond just WMF, so we shouldn't reduce the
discussion of features like this to a yes or no based only on what
the Foundation may or may not want.
-Peachey



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-29 Thread Erik Moeller
2010/11/25 bawolff bawolff...@gmail.com:
 Personally I think it would be nicer if you could associate source
 files with the final files.

Yeah, this was discussed a bit earlier in this thread. As far as I can
tell, that approach adds a fair degree of complexity (requirement of
tracking a whole new class of files in association with other files,
including versioning, deletion, etc.). It also seems to presume that
you'd never want to reference those same files using standard
MediaWiki links. It's not clear to me that such a system has clear
advantages over using normal wiki-links to source files from
appropriate places.

Stepping back a bit, I did a bit more research over the weekend as to
the current state of sourcing in Wikimedia Commons, and which file
types would be the most important to support.

Generally speaking, there's an existing (albeit limited) practice of
adding sources that can be represented as simple plain-text files,
such as POV-Ray, Gnuplot, etc. Sometimes these are formatted using the
syntax-highlighting extension, sometimes not. This practice could be
made more formal by directly requesting that users add source data
when they specify that a file has been created using one of these
applications (which is often identified using "Created with"
templates). But I don't necessarily see that any additional software
support is needed for these formats, save perhaps easier
downloadability, which could be added to the syntax-highlighting
extension.

For binary formats (and perhaps complex XML-based formats), the
following stand out as being of high significance:

* .blend as Blender's native export format and COLLADA as an open
interchange format
* .xcf as Gimp's native format (preserving layers and other
meta-information for bitmap images)
* .scribus as Scribus' native format (XML, but files can get very
large + have dependencies)
* .odt, .odp, .od as OpenDocument formats
* potentially OpenEXR and some other open interchange formats.

As far as I understand the pure security (as opposed to content)
concerns, these fall primarily into these categories:

* client-side execution of unsafe formats using designated
applications (embedded macros, references to other malicious content
etc.)
* exploitation of browser in-line display for purposes of XSS attacks or similar

Let me know if I'm missing a large category. I'm assuming server-side
execution is not an issue for Wikimedia given correct server
configuration.

Full security for these and other conceivably useful binary formats
seems difficult to obtain to me (that is, making sure that nothing bad
ever runs on a user's computer if they open a file). The restricted
upload (or restricted attachment) approach builds on social trust to
complement technical verification methods. We'd still have to invent
some additional machinery to implement security warnings before ever
exposing such files directly to the user.

Sacrificing easy individual file manageability, I wonder if it
wouldn't be most straightforward to write a decent ZIP handler (with
directory display, and thumbnailing of included images, for purposes
of patrolling), to disallow ZIP files that contain non-whitelisted
filetypes, and to use ZIPs as the container for all complex,
free-format source uploads. [[File:Bla source.zip]] could then just be
referenced as part of the file description pages where relevant.
Because some of the aforementioned binary formats are effectively
archives, some of this work would likely be necessary anyway.
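
A minimal sketch of the directory display and whitelist check described above, in Python rather than MediaWiki's PHP. The extension list and both function names are assumptions made for illustration; the thread does not settle on a concrete whitelist. It also relies on Python's own reading of the ZIP directory, which is not guaranteed to match what other ZIP readers see.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical whitelist; the thread does not fix a concrete list.
ALLOWED_EXTENSIONS = {"blend", "dae", "xcf", "png", "txt"}


def list_entries(data: bytes):
    """Return (name, uncompressed size) for every file entry in the archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [(info.filename, info.file_size)
                for info in archive.infolist() if not info.is_dir()]


def disallowed_entries(data: bytes, allowed=ALLOWED_EXTENSIONS):
    """Return the names of entries whose extension is not on the whitelist."""
    bad = []
    for name, _size in list_entries(data):
        ext = PurePosixPath(name).suffix.lower().lstrip(".")
        if ext not in allowed:
            bad.append(name)
    return bad
```

An upload would be refused whenever disallowed_entries() returns a non-empty list; list_entries() is the kind of data a patrolling view could display.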

That said, I'm not wedded to any particular approach. I hope we can
identify reasonably simple steps that we can take to significantly
expand our support for source files in the near term, because such
files are essential for re-use.
-- 
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-29 Thread Roan Kattouw
2010/11/29 Erik Moeller e...@wikimedia.org:
 As far as I understand the pure security (as opposed to content)
 concerns, these fall primarily into these categories:

 * client-side execution of unsafe formats using designated
 applications (embedded macros, references to other malicious content
 etc.)
 * exploitation of browser in-line display for purposes of XSS attacks or 
 similar

 Let me know if I'm missing a large category. I'm assuming server-side
 execution is not an issue for Wikimedia given correct server
 configuration.

Server side execution is not an issue, no.

The client-side issues can all be reduced to a file acting as type A
to MediaWiki and as type B to the victim, where A is some harmless
file type we'd like to allow users to upload and B is some potentially
dangerous file type. This is usually enabled by one or more of the
following factors:
* IE second-guesses the server-provided MIME type in favor of its own
brain-dead MIME type detection algorithm, which in particular is
extremely eager to treat things as HTML (causing any embedded JS to be
executed): the presence of certain HTML tags or tag-like strings in
the first 255 bytes is sufficient reason for IE to call something HTML
* File formats are often interpreted flexibly, so a file that doesn't
conform to the standard completely may be read just fine by most
applications. These flexibilities allow for creating a file that looks
like an A but also comes close enough to being a B. For example,
running an HTML page containing unified diff text in the middle
through patch(1) will usually work, because patch(1) discards
garbage before and after the diff. These flexibilities are usually
undocumented and vary between applications, so it can be difficult to
predict whether a file qualifies as almost a B
* Some file formats are designed in such a way that a file can
actually be a completely valid A *and* a completely valid B all at the
same time. This is the case for most ZIP and ZIP-like formats
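
The last point can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the prefix is a bare GIF header rather than a complete image, the member name is made up, and the helper names are hypothetical. It shows that a check on the leading bytes and a ZIP reader can both be satisfied by the same file, because ZIP readers locate the archive from the end of the file.

```python
import io
import zipfile


def make_zip(members: dict) -> bytes:
    """Build an in-memory ZIP archive from a {name: payload} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return buf.getvalue()


# Leading bytes that a naive magic-number check would take for a GIF
# (header only; this is not a complete, renderable image).
GIF_PREFIX = b"GIF89a" + b"\x00" * 16

# Prepend the GIF-looking bytes to an ordinary ZIP archive.
polyglot = GIF_PREFIX + make_zip({"Evil.class": b"\xca\xfe\xba\xbe"})


def looks_like_gif(data: bytes) -> bool:
    """Magic-number check on the first six bytes only."""
    return data[:6] in (b"GIF87a", b"GIF89a")


def zip_member_names(data: bytes):
    """List member names as seen by Python's ZIP reader."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()
```

Here looks_like_gif(polyglot) is true while zip_member_names(polyglot) still finds the embedded member.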

To illustrate the last sentence of the second bullet point, I'll quote
Tim's blog post on upload security [1] (which is a fun read for anyone
even mildly interested in the topic). It's part of the section on the
GIFAR vulnerability, which involves a file that's a valid GIF or ZIP
file, but which Java happily executes as a JAR (a ZIP-like format for
executable Java bytecode) file because Java's JAR format validation is
extremely lax, almost nonexistent. The only validation it does do is
check for a certain magic number at the end of the file, so rejecting

An alternative [to rejecting all ZIP files] would be to parse the
entire zip directory and to reject any archives that contain a file
with a .class extension. I can’t vouch for this method. **If you did
this, the zip library you used would have to be exactly as tolerant of
zip format errors as the one used by Java.** It would probably be best
to actually shell out to Java to do the test.

(emphasis mine)

Roan Kattouw (Catrope)

[1] http://tstarling.com/blog/2008/12/secure-web-uploads/



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-29 Thread Platonides
Roan Kattouw wrote:
 An alternative [to rejecting all ZIP files] would be to parse the
 entire zip directory and to reject any archives that contain a file
 with a .class extension. I can’t vouch for this method. **If you did
 this, the zip library you used would have to be exactly as tolerant of
 zip format errors as the one used by Java.** It would probably be best
 to actually shell out to Java to do the test.
 
 (emphasis mine)

If we consider the performance cost of parsing full zip files acceptable
(as opposed to reading just 512 bytes or the central directory), we can
quite easily accept many zip files.
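
For reference, the cheap end of that trade-off looks roughly like this: a sketch that reads only the end-of-central-directory record at the tail of the file. The function name is hypothetical, ZIP64 archives are not handled, and a signature that happens to occur inside the archive comment would confuse it, so it should not be read as a validation routine.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22        # size of the record without its trailing comment
MAX_COMMENT_SIZE = 0xFFFF   # the comment length field is 16 bits wide


def read_end_record(data: bytes):
    """Locate the end-of-central-directory record near the end of `data`.

    Returns (entry_count, directory_size, directory_offset), or None if no
    record is found in the region where one could start.
    """
    tail_start = max(0, len(data) - EOCD_FIXED_SIZE - MAX_COMMENT_SIZE)
    pos = data.rfind(EOCD_SIGNATURE, tail_start)
    if pos < 0 or pos + EOCD_FIXED_SIZE > len(data):
        return None
    # signature, disk number, disk with directory, entries on this disk,
    # total entries, directory size, directory offset, comment length
    (_sig, _disk, _cd_disk, _entries_here, entries_total,
     cd_size, cd_offset, _comment_len) = struct.unpack_from(
        "<4sHHHHIIH", data, pos)
    return entries_total, cd_size, cd_offset
```

Everything beyond this (walking the directory entries, then the local headers and member data) is the "full parse" whose cost is in question.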

There's also the issue of the jar: protocol, but that seems to have been
fixed since Firefox 2.0.0.10, so it's probably not worth taking into account.
http://kb.mozillazine.org/Network.jar.open-unsafe-types




Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-29 Thread Bryan Tong Minh
On Mon, Nov 29, 2010 at 9:29 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 An alternative [to rejecting all ZIP files] would be to parse the
 entire zip directory and to reject any archives that contain a file
 with a .class extension. I can’t vouch for this method. **If you did
 this, the zip library you used would have to be exactly as tolerant of
 zip format errors as the one used by Java.** It would probably be best
 to actually shell out to Java to do the test.


I was thinking about this. There appears to be no option to the java
command line client to only check a file without executing. An option
would be to invoke the java debugger (jdb), which initially breaks at
the first instruction and presumably fails if the file is not a valid
jar. Still sounds nasty though, plus the fact that jdb is not a
generally installed program.


Bryan



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-29 Thread Platonides
Bryan Tong Minh wrote:
 On Mon, Nov 29, 2010 at 9:29 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 An alternative [to rejecting all ZIP files] would be to parse the
 entire zip directory and to reject any archives that contain a file
 with a .class extension. I can’t vouch for this method. **If you did
 this, the zip library you used would have to be exactly as tolerant of
 zip format errors as the one used by Java.** It would probably be best
 to actually shell out to Java to do the test.

 
 I was thinking about this. There appears to be no option to the java
 command line client to only check a file without executing. An option
 would be to invoke the java debugger (jdb), which initially breaks at
 the first instruction and presumably fails if the file is not a valid
 jar. Still sounds nasty though, plus the fact that jdb is not a
 generally installed program.
 
 
 Bryan

Note that you can't simply check (or reverse-engineer) that JVM version X
doesn't treat the file as a jar, since it could still be detected as one
by X-1 or X+1. So we would need to test against the whole range of JVMs
still in use.




Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-29 Thread Dmitriy Sintsov
* Bryan Tong Minh bryan.tongm...@gmail.com [Tue, 30 Nov 2010 08:44:43 
+0100]:
 I think that the most recent version should be sufficient. I don't
 think Java would break backwards compatibility: users wouldn't be
 happy if their old jar suddenly stops working on a new JVM.

Why use the outdated and inefficient ZIP format at all? 7zip is
incompatible with the JVM, so wouldn't it be a better choice for archive
uploads? Or is it too hard to parse on the PHP side (I guess a console
exec would be required)?
Dmitriy



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-25 Thread David Gerard
On 25 November 2010 07:58, Bryan Tong Minh bryan.tongm...@gmail.com wrote:

 I think you are taking the wrong approach here, although I agree with
 MZMcBride's reply to your mail: "From a social and technical
 perspective, this proposal is horribly hackish. [...] Given the
 current parameters, this is probably the best solution. [...]"


The rock and hard place here are:

1. This solution is horribly hacky and bletcherous.
2. The ideal is the enemy of the actually adequate; at present things
are not adequate.

Do we have a clear picture of what the ideal looks like? Are the hacks
clearly on the path to it, without obstructing it in any way?


- d.



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-25 Thread Platonides
Erik Moeller wrote:
 [Kicking this thread back to life, full-quoting below only for quick 
 reference.]
 
 I've collected some additional notes on this here:
 http://commons.wikimedia.org/wiki/Commons:Restricted_uploads
 
 Would appreciate feedback & will circulate further in the Commons community.
 
 Thanks,
 Erik

How do you expect the end users to send the files? By uploading to a
service like megaupload? As email attachments? Via OTRS? Using a
toolserver app?

This seems like a use case for the upload stash: allow users to upload
the file, but require approval before it is finally shown publicly.
We could even show the files publicly, as long as there's no direct
download, requiring downloaders to provide a session token in the process.

In any case, files treated as html by IE would still need to be disallowed.
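
A rough sketch of such a pre-check, assuming the 255-byte window mentioned elsewhere in this thread. The tag list and the function name are hypothetical; the actual set of strings IE reacts to is not given here, so a real check would need a list taken from a documented source.

```python
import re

SNIFF_WINDOW = 255  # bytes inspected, per the description in this thread

# Hypothetical tag list for illustration only; the real set that IE
# reacts to is not specified in this thread.
SUSPICIOUS_TAGS = (b"html", b"head", b"body", b"script",
                   b"title", b"img", b"table", b"pre")

_TAG_PATTERN = re.compile(
    rb"<\s*(?:" + b"|".join(re.escape(t) for t in SUSPICIOUS_TAGS) + rb")\b",
    re.IGNORECASE,
)


def may_be_sniffed_as_html(data: bytes) -> bool:
    """True if a tag-like string appears within the first SNIFF_WINDOW bytes."""
    return _TAG_PATTERN.search(data[:SNIFF_WINDOW]) is not None
```

A file for which this returns true would be rejected regardless of its declared type.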




Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-25 Thread bawolff
 Message: 5
 Date: Wed, 24 Nov 2010 15:46:24 -0800
 From: Erik Moeller e...@wikimedia.org
 Subject: Re: [Wikitech-l] Commons ZIP file upload for admins
 To: Wikimedia developers wikitech-l@lists.wikimedia.org
 Message-ID:
       aanlktimd7kxngs4azgpanr_84ok_th9t1dsanc7st...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 [Kicking this thread back to life, full-quoting below only for quick 
 reference.]

 I've collected some additional notes on this here:
 http://commons.wikimedia.org/wiki/Commons:Restricted_uploads

 Would appreciate feedback & will circulate further in the Commons community.

 Thanks,
 Erik

Personally I think it would be nicer if you could associate source
files with the final files.
Something like:
* User uploads a JPEG of a 3D image (or whatever)
* On the image description page for the JPEG, there is an "upload
source file" link
* Users (who have appropriate permissions) can upload the associated
source files with this link.
* These source files might appear as a subpage of the primary
image/document/media, or they might just appear in list form at the
bottom of the image description page of the main image/media. Either
way, the source files would be associated with a single main file.

Doing it this way would limit the feature to source files of actually
uploaded files (so less random cruft lying around, no orphaned source
files, less chance of people abusing the feature to get around file
type restrictions). I also personally don't like the idea of uploading
archives. Instead I think it would be better just to upload all the
source files needed. (although that might fall apart if you're
uploading source files for something very complex which has many
source files in a specific directory structure). There could also be a
"download all" option where all the source files get tar'ed together on
the server side for an easy download.

-bawolff



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-24 Thread Erik Moeller
[Kicking this thread back to life, full-quoting below only for quick reference.]

I've collected some additional notes on this here:
http://commons.wikimedia.org/wiki/Commons:Restricted_uploads

Would appreciate feedback & will circulate further in the Commons community.

Thanks,
Erik


2010/10/25 Erik Moeller e...@wikimedia.org:
 2010/10/25 Brion Vibber br...@pobox.com:
 In all cases we have the worry that if we allow uploading those funky
 formats, we'll either a) end up with malicious files or b) end up with lazy
 people using and uploading non-free editing formats when we'd prefer them to
 use freely editable formats. I'm not sure I like the idea of using admin
 powers to control being able to upload those, though; bottlenecking content
 reviews as a strict requirement can be problematic on its own.

 Yeah, I don't like the bottleneck approach either, but in the absence
 of better systems, it may be the best way to go as an immediate
 solution. We could do it for a list of whitelisted open formats that
 are requested by the community. And we'd see from usage which file
 types we need to prioritize proper support/security checks for.

 What I'd probably like to see is a more wide-open allowal of arbitrary
 'source files' which can be uploaded as attachments to standalone files. We
 could give them more limited access: download only, no inline viewing, only
 allowed if DLs are on separate safe domain, etc.

 It seems fairly straightforward to me to say: These free file formats
 are permitted to be uploaded. We haven't developed fully sophisticated
 security checks for them yet, so we're asking trusted users to do
 basic sanity checks until we've developed automatic checks. We can
 then prod people to convert any proprietary formats into free ones
 that are on that whitelist. And if they're free formats, I'm not sure
 why they shouldn't be first-class citizens -- as Michael mentioned,
 that makes it possible to plop in custom handlers at a later time. A
 COLLADA handler for 3D files may seem like a remote possibility, but
 it's certainly within the realm of sanity. ZIP files would have to be
 specially treated so they're only allowed if they contain only files
 in permitted formats.

 So, consistent with Michael's suggestion, we could define a
 'restricted-upload' right, initially given to admins only but possibly
 expanded to other users, which would allow files from the potentially
 insecure list of extensions to be uploaded, and for ZIP files, would
 ensure that only accepted file types are contained within the archive.
 The resultant review bottleneck would simply be a reflection that we
 haven't gotten around to adding proper support for these file types
 yet. On the plus side, we could add restricted upload support for new
 open formats as soon as there's consensus to do so.

 The main downside I would see is that users might end up being
 confused why these files get uploaded. To mitigate this, we could add
 a This file has a restricted filetype. Files of this type can
 currently only be uploaded by administrators for security reasons
 note on file description pages.
 --
 Erik Möller
 Deputy Director, Wikimedia Foundation

 Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate




-- 
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-24 Thread MZMcBride
Erik Moeller wrote:
 I've collected some additional notes on this here:
 http://commons.wikimedia.org/wiki/Commons:Restricted_uploads
 
 Would appreciate feedback & will circulate further in the Commons community.

From a social and technical perspective, this proposal is horribly hackish.
The over-arching goal should be to implement fewer hacks, though we
obviously don't live in an ideal world.

Given the current parameters, this is probably the best solution. However,
there needs to be a more in-depth analysis of the potential security
implications of some of these file types. Even trusted users shouldn't be
able to upload files that allow for the arbitrary injection of PHP, for
example. I suppose that's why you're asking for more feedback from
wikitech-l.

The current proposal is vague about which specific file types are desired. A
concrete list ought to be generated so that people can research the known
security implications of allowing those file types to be uploaded.

I don't think there is ever going to be (or ever should be) a generic
whitelist to allow any and all free/open file types. What are the specific
file types that are currently banned that you're seeking to have partially
unbanned?

MZMcBride





Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-24 Thread Bryan Tong Minh
On Thu, Nov 25, 2010 at 12:46 AM, Erik Moeller e...@wikimedia.org wrote:
 [Kicking this thread back to life, full-quoting below only for quick 
 reference.]

 I've collected some additional notes on this here:
 http://commons.wikimedia.org/wiki/Commons:Restricted_uploads

 Would appreciate feedback & will circulate further in the Commons community.


I think you are taking the wrong approach here, although I agree with
MZMcBride's reply to your mail: "From a social and technical
perspective, this proposal is horribly hackish. [...] Given the
current parameters, this is probably the best solution. [...]"

I believe that we should really be aiming for scanning for security
vulnerabilities and reject only those files that pose a vulnerability.
For example, we do now outright reject open office files, as they may
encapsulate files that will be executed by the JVM. We should be able
to determine the exact circumstances that pose a vulnerability and
only reject those files, similar to what we have done for the embedded
HTML in files that affects IE.


Bryan



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-26 Thread Maciej Jaros
@2010-10-26 03:45, Erik Moeller:
 2010/10/25 Brion Vibber br...@pobox.com:
 In all cases we have the worry that if we allow uploading those funky
 formats, we'll either a) end up with malicious files or b) end up with lazy
 people using and uploading non-free editing formats when we'd prefer them to
 use freely editable formats. I'm not sure I like the idea of using admin
 powers to control being able to upload those, though; bottlenecking content
 reviews as a strict requirement can be problematic on its own.
 Yeah, I don't like the bottleneck approach either, but in the absence
 of better systems, it may be the best way to go as an immediate
 solution. We could do it for a list of whitelisted open formats that
 are requested by the community. And we'd see from usage which file
 types we need to prioritize proper support/security checks for.

 What I'd probably like to see is a more wide-open allowal of arbitrary
 'source files' which can be uploaded as attachments to standalone files. We
 could give them more limited access: download only, no inline viewing, only
 allowed if DLs are on separate safe domain, etc.
 It seems fairly straightforward to me to say: These free file formats
 are permitted to be uploaded. We haven't developed fully sophisticated
 security checks for them yet, so we're asking trusted users to do
 basic sanity checks until we've developed automatic checks. We can
 then prod people to convert any proprietary formats into free ones
 that are on that whitelist. And if they're free formats, I'm not sure
 why they shouldn't be first-class citizens -- as Michael mentioned,
 that makes it possible to plop in custom handlers at a later time. A
 COLLADA handler for 3D files may seem like a remote possibility, but
 it's certainly within the realm of sanity. ZIP files would have to be
 specially treated so they're only allowed if they contain only files
 in permitted formats.

 So, consistent with Michael's suggestion, we could define a
 'restricted-upload' right, initially given to admins only but possibly
 expanded to other users, which would allow files from the potentially
 insecure list of extensions to be uploaded, and for ZIP files, would
 ensure that only accepted file types are contained within the archive.
 The resultant review bottleneck would simply be a reflection that we
 haven't gotten around to adding proper support for these file types
 yet. On the plus side, we could add restricted upload support for new
 open formats as soon as there's consensus to do so.

 The main downside I would see is that users might end up being
 confused why these files get uploaded. To mitigate this, we could add
 a This file has a restricted filetype. Files of this type can
 currently only be uploaded by administrators for security reasons
 note on file description pages.

ODS, ODT and the like should be fairly easy to check, at least on a basic
level. A very basic check would be to see whether the file contains a
Basic or Scripts folder. A slightly more advanced check would be to see
whether manifest.xml contains application/binary (in case anyone tried to
change the default naming) and whether any file contains script:module
(for the same reason).
If any of these is true, there should be a warning.
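
The three checks could be sketched like this in Python. The function name and the warning strings are made up for illustration, and the sketch reads every member fully into memory, which a real implementation would probably want to bound.

```python
import io
import zipfile

MANIFEST_NAME = "META-INF/manifest.xml"  # manifest location in ODF packages


def odf_warnings(data: bytes):
    """Return warning strings for the basic macro checks described above."""
    warnings = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        # 1. A Basic/ or Scripts/ folder at the top level of the package.
        if any(n.startswith(("Basic/", "Scripts/")) for n in names):
            warnings.append("contains a Basic or Scripts folder")
        # 2. The manifest declares binary content (renamed script folders).
        if (MANIFEST_NAME in names
                and b"application/binary" in archive.read(MANIFEST_NAME)):
            warnings.append("manifest declares application/binary content")
        # 3. Any member mentions script:module.
        if any(b"script:module" in archive.read(n)
               for n in names if not n.endswith("/")):
            warnings.append("a member contains script:module")
    return warnings
```

An empty list means none of the three checks fired; it does not mean the document is safe.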

I think we should also support Dia for diagrams and XCF for layered 
bitmaps. Don't know much about XCF, but Dia is a simple XML file (which 
might be zipped) and so shouldn't be dangerous at all. I guess it could 
even be unzipped upon loading because Dia supports both zipped and 
unzipped versions alike. There is/was also Extension:Dia which generates 
thumbnails... It seems to work fine even with 1.16 from the trunk and 
the latest Dia version. It doesn't work with zipped Dia files but this 
would be manageable.

Regards,
Nux.


Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-26 Thread John Vandenberg
On Tue, Oct 26, 2010 at 6:50 AM, Max Semenik maxsem.w...@gmail.com wrote:
 

 Instead of amassing social constructs around technical deficiency, I
 propose to fix bug 24230 [1] by implementing proper checking for JAR
 format. Also, we need to check all contents with antivirus and
 disallow certain types of files inside archives (such as .exe). Once
 we took all these precautions, I see no need to restrict ZIPs to any
 special group. Of course, this doesn't mean that we should allow all the
 safe ZIPs, just several open ZIP-based file formats.

If we only want zips for several formats, we should check that they
are of the expected type, _and_ that they consist of open file formats
within the zip.

e.g. Office Open XML (the MS format) can include binary files for OLE
objects and fonts (I think)

see Table 2. Content types in a ZIP container

http://msdn.microsoft.com/en-us/library/aa338205(office.12).aspx

OOXML can also include any other mimetype, which are registered
_within_ the zip, and linked into the main content file.

afaics, allowing only safe zips to be uploaded isn't difficult.

Expand the zip, and reject any zip which contains files that are on
$wgFileBlacklist, or that are not on $wgFileExtensions + $wgZipFileExtensions.

$wgZipFileExtensions would consist of array('xml')

Then check the mimetypes of the files in the zip, against
$wgMimeTypeBlacklist (with 'application/zip' removed), again allowing
desired XML mimetypes through.
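
A rough sketch of the extension check, written in Python rather than PHP purely for illustration. The three sets are hypothetical stand-ins for $wgFileBlacklist, $wgFileExtensions and the proposed $wgZipFileExtensions, their values are invented, and the mimetype pass is left out.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical stand-ins for the MediaWiki settings; values are illustrative only.
FILE_BLACKLIST = {"exe", "php", "js", "jar", "class"}
FILE_EXTENSIONS = {"png", "jpg", "svg"}
ZIP_FILE_EXTENSIONS = {"xml"}


def rejected_members(zip_bytes):
    """Return the names of members that would cause the upload to be rejected."""
    allowed = FILE_EXTENSIONS | ZIP_FILE_EXTENSIONS
    bad = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Extension of the member name, lower-cased, without the dot.
            ext = PurePosixPath(info.filename).suffix.lstrip(".").lower()
            if ext in FILE_BLACKLIST or ext not in allowed:
                bad.append(info.filename)
    return bad
```

Note that in this sketch a member with no extension at all is rejected, and extensions are only names: the mimetype check described above would still be needed.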

--
John Vandenberg



[Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Erik Moeller
Hello all,

for some types of resources, it's desirable to upload source files
(whether it's Blender, COLLADA, Scribus, EDL, or some other format),
so that others can more easily remix and process them. Currently, as
far as I know, there's no way to upload these resources to Commons.

What would be the arguments against allowing administrators to upload
arbitrary ZIP files on Wikimedia Commons, allowing the Commons
community to develop policy and process around when such archived
resources are appropriate? An alternative, of course, would be to
whitelist every possible source format for admins, but it seems to me
that it would be a good general policy to not enable additional
support for formats that aren't officially supported (reduces
confusion among users about what's permitted -- there's only one file
format they can't use).

Thoughts?

Thanks,
Erik

-- 
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Max Semenik
On 25.10.2010, 23:02 Erik wrote:

 Hello all,

 for some types of resources, it's desirable to upload source files
 (whether it's Blender, COLLADA, Scribus, EDL, or some other format),
 so that others can more easily remix and process them. Currently, as
 far as I know, there's no way to upload these resources to Commons.

 What would be the arguments against allowing administrators to upload
 arbitrary ZIP files on Wikimedia Commons, allowing the Commons
 community to develop policy and process around when such archived
 resources are appropriate? An alternative, of course, would be to
 whitelist every possible source format for admins, but it seems to me
 that it would be a good general policy to not enable additional
 support for formats that aren't officially supported (reduces
 confusion among users about what's permitted -- there's only one file
 format they can't use).

 Thoughts?

Instead of amassing social constructs around technical deficiency, I
propose to fix bug 24230 [1] by implementing proper checking for JAR
format. Also, we need to check all contents with antivirus and
disallow certain types of files inside archives (such as .exe). Once
we took all these precautions, I see no need to restrict ZIPs to any
special group. Of course, this doesn't mean that we should allow all the
safe ZIPs, just several open ZIP-based file formats.

-
[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=24230

-- 
Best regards,
  Max Semenik ([[User:MaxSem]])




Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Michael Dale
On 10/25/2010 12:02 PM, Erik Moeller wrote:
 Hello all,

 for some types of resources, it's desirable to upload source files
 (whether it's Blender, COLLADA, Scribus, EDL, or some other format),
 so that others can more easily remix and process them. Currently, as
 far as I know, there's no way to upload these resources to Commons.

 What would be the arguments against allowing administrators to upload
 arbitrary ZIP files on Wikimedia Commons, allowing the Commons
 community to develop policy and process around when such archived
 resources are appropriate? An alternative, of course, would be to
 whitelist every possible source format for admins, but it seems to me
 that it would be a good general policy to not enable additional
 support for formats that aren't officially supported (reduces
 confusion among users about what's permitted -- there's only one file
 format they can't use).

 Thoughts?

 Thanks,
 Erik



It's ideal if we actually support these formats, so we can do things 
like thumbnails, basic metadata, etc. Failing that, it's better to support 
a given file extension than it is to support zip files. This way, if in 
'the future' we add support for X file format, then we have X format 
files stored consistently so we can support representation of that file 
format.

If we add blanket support for 'throw whatever you want' into a zip file, 
it will be difficult to give a quality representation of that asset in 
the future. ( other than as a zip file with multiple sub assets ).

If for example someone writes a diff engine for representing 3d model 
transformations, we won't as easily be able to plug-in that tool, if we 
don't have a consistent storage model for that file format.

That being said, there may be some composite asset sets that lack 
container systems, in which case it would not be bad to support some open 
container format.

The number of formats or multimedia asset compositing systems that are 
not web representable with JavaScript engines or natively supported in 
the browser should be on a dramatic decline in the next decade, so best 
to just focus on support for such formats.

For example, we prefer SVG uploads to a zip file with Illustrator 
assets, because SVG is representable in the browser and there are 
JavaScript-based engines for editing SVG 
[http://svg-edit.googlecode.com/svn/branches/2.4/editor/svg-editor.html], 
etc. Likewise for 3D model representation with the COLLADA format 
(although that is much more in its infancy at this point in time).

--michael




Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Aryeh Gregor
On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik maxsem.w...@gmail.com wrote:
 Instead of amassing social constructs around technical deficiency, I
 propose to fix bug 24230 [1] by implementing proper checking for JAR
 format.

Does that bug even affect Wikimedia?  We have uploads segregated on
their own domain, where we don't set cookies or do anything else
interesting, so what would an uploaded JAR file even do?  If that kind
of attack is still a problem even with separate domains, we can do
like Mozilla's Bugzilla and serve each uploaded file from its own
unique domain (that would have ramifications for how browsers fetch
the images, but they might be positive anyway).



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Platonides
Aryeh Gregor wrote:
 On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik maxsem.w...@gmail.com wrote:
 Instead of amassing social constructs around technical deficiency, I
 propose to fix bug 24230 [1] by implementing proper checking for JAR
 format.
 
 Does that bug even affect Wikimedia?  We have uploads segregated on
 their own domain, where we don't set cookies or do anything else
 interesting, so what would an uploaded JAR file even do?  If that kind
 of attack is still a problem even with separate domains, we can do
 like Mozilla's Bugzilla and serve each uploaded file from its own
 unique domain (that would have ramifications for how browsers fetch
 the images, but they might be positive anyway).

Well, the fact that an attacker would not be able to steal the cookies if
they could place a jar file there* doesn't mean a malicious applet there
isn't bad.

*Not sure if we can really assert that. Most likely it varies depending
on browser, JVM and version.

Doing a full ZIP exploration against java classes is simple. However, we
should check that everything there is clean, not that nothing there is
blacklisted.

Archive formats have their own can of issues. We don't want people to
upload an OASIS file that contains a videogame, even if it's not a jar
or a virus. How to determine if a file should be in the archive or not?
What to do with archived archives?




Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Marco Schuster
On Mon, Oct 25, 2010 at 10:09 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:
 On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik maxsem.w...@gmail.com wrote:
 Instead of amassing social constructs around technical deficiency, I
 propose to fix bug 24230 [1] by implementing proper checking for JAR
 format.

 Does that bug even affect Wikimedia?  We have uploads segregated on
 their own domain, where we don't set cookies or do anything else
 interesting, so what would an uploaded JAR file even do?
upload.wikimedia.org could end up on Google's Safe Surfing (or however
it's called) blacklist for hosting malicious .jar's which are injected
on another pwned web site or loaded through pwned advertising brokers.
Given the fact that Java is the 2nd biggest exploit vector in terms of
exploits (but 1st in terms of impact - users don't update Java as
often as the Adobe Reader), it should not be allowed to upload JARs
(or things that look like something else, but in fact can be loaded and
executed by the JRE) to Wikipedia.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de


Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Martijn Hoekstra
On Mon, Oct 25, 2010 at 10:51 PM, Marco Schuster
ma...@harddisk.is-a-geek.org wrote:
 On Mon, Oct 25, 2010 at 10:09 PM, Aryeh Gregor
 simetrical+wikil...@gmail.com wrote:
 On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik maxsem.w...@gmail.com wrote:
 Instead of amassing social constructs around technical deficiency, I
 propose to fix bug 24230 [1] by implementing proper checking for JAR
 format.

 Does that bug even affect Wikimedia?  We have uploads segregated on
 their own domain, where we don't set cookies or do anything else
 interesting, so what would an uploaded JAR file even do?
 upload.wikimedia.org could end up on Google's Safe Surfing (or however
 it's called) blacklist for hosting malicious .jar's which are injected
 on another pwned web site or loaded through pwned advertising brokers.
 Given the fact that Java is the 2nd biggest exploit vector in terms of
 exploits (but 1st in terms of impact - users don't update Java as
 often as the Adobe Reader), it should not be allowed to upload JARs
 (or things that look like something else, but in fact can be loaded and
 executed by the JRE) to Wikipedia.


Should we also be exploring any possibly malicious archives inside
archives recursively, or is it good enough just to make sure the
archive itself is good?



Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Platonides
Martijn Hoekstra wrote:
 Should we also be exploring any possibly malicious archives inside
 archives recursively, or is it good enough just to make sure the
 archive itself is good?

I think that we should block such files.
Also note that we can't recursively analyse everything, since that would
allow someone to DoS us.
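
Something along these lines, as a sketch only: look at the outer archive once, cap the work, and refuse anything that contains another archive instead of descending into it. The limits and the function name check_archive are invented for illustration, written in Python for brevity.

```python
import io
import zipfile

MAX_ENTRIES = 1000             # hypothetical cap on number of members
MAX_TOTAL_BYTES = 50 * 2**20   # hypothetical cap on declared uncompressed size
ZIP_MAGIC = b"PK\x03\x04"      # ZIP local file header signature


def check_archive(zip_bytes):
    """Return None if the archive passes, else a short reason string."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        entries = archive.infolist()
        if len(entries) > MAX_ENTRIES:
            return "too many entries"
        # file_size is the size declared in the archive, which may be wrong.
        if sum(info.file_size for info in entries) > MAX_TOTAL_BYTES:
            return "uncompressed size too large"
        for info in entries:
            if info.is_dir():
                continue
            # Read only the first four bytes; never unpack the inner archive.
            with archive.open(info) as member:
                if member.read(4) == ZIP_MAGIC:
                    return "nested archive: " + info.filename
    return None
```

This only recognises nested ZIP-family files by their leading signature; other archive formats would need their own signatures, and the declared sizes should not be trusted on their own.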




Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Brion Vibber
On Mon, Oct 25, 2010 at 1:05 PM, Michael Dale md...@wikimedia.org wrote:

 It's ideal if we actually support these formats, so we can do things
 like thumbnails, basic metadata, etc. Failing that, it's better to support
 a given file extension than it is to support zip files. This way, if in
 'the future' we add support for X file format, then we have X format
 files stored consistently so we can support representation of that file
 format.

 If we add blanket support for 'throw whatever you want' into a zip file,
 it will be difficult to give a quality representation of that asset in
 the future. ( other than as a zip file with multiple sub assets ).


I tend to agree that it's preferable to be able to recognize and validate
formats; though as noted sometimes you're going to have stuff that doesn't
really fit well in an individual file.

Certainly for Wikibooks I could envision *all sorts* of totally legitimate
use for being able to upload/download various files, including archives. The
Blender handbook could use example files and projects to download, which
might include dozens of support files. A programming module might need to
provide source code and sample input files.

Then we have the 'media source file' case: an animation should be able to
include the Blender or POV-Ray or whatever sources that were used to create
it. A pretty picture built in a layered raster system like Gimp or Photoshop
would do better to include the source .xcf or .psd than not to, even if the
source file is in a format that's harder to work with.

I believe we've got an old bug on the idea of being able to explicitly attach a
source file:
https://bugzilla.wikimedia.org/show_bug.cgi?id=17012


In all cases we have the worry that if we allow uploading those funky
formats, we'll either a) end up with malicious files or b) end up with lazy
people using and uploading non-free editing formats when we'd prefer them to
use freely editable formats. I'm not sure I like the idea of using admin
powers to control being able to upload those, though; bottlenecking content
reviews as a strict requirement can be problematic on its own.

What I'd probably like to see is a more wide-open allowance of arbitrary
'source files' which can be uploaded as attachments to standalone files. We
could give them more limited access: download only, no inline viewing, only
allowed if DLs are on a separate safe domain, etc.

I don't really relish the thought of checking image source data for warez
archives, though. :) Can't guarantee a magic solution there.

-- brion