Hey guys,

we recently contributed some security improvements to the popular JSF component library PrimeFaces: the accepted content types specified for a file upload are now validated on the server side. To do so, we use Java's Files.probeContentType, which automatically picks up registered java.nio.file.spi.FileTypeDetector service implementations. The default implementations, however, appear to rely primarily on the file extension (for example via registry lookups), which is insufficient from a security point of view. That's why we will recommend having Apache Tika on the classpath, more precisely the tika-java7 dependency.
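To make the setup concrete, here is a minimal sketch of the kind of server-side check described above. It is not the actual PrimeFaces code; the class name UploadTypeCheck, the helper isAccepted, and the allowed types are made up for illustration. Only Files.probeContentType is the real JDK call.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class UploadTypeCheck {

    // Hypothetical helper (not PrimeFaces API): accept the file only if the
    // detected content type is one of the explicitly allowed types.
    static boolean isAccepted(Path file, Set<String> allowedTypes) throws IOException {
        // Consults the installed FileTypeDetector services first, then the
        // platform default; returns null if the type cannot be determined.
        String detected = Files.probeContentType(file);
        return detected != null && allowedTypes.contains(detected);
    }

    public static void main(String[] args) throws IOException {
        Set<String> allowed = new HashSet<>(Arrays.asList("image/png", "image/jpeg"));
        // A text file that merely carries a .png extension.
        Path upload = Files.createTempFile("upload-", ".png");
        try {
            Files.write(upload, "not really an image".getBytes());
            System.out.println("detected: " + Files.probeContentType(upload));
            System.out.println("accepted: " + isAccepted(upload, allowed));
        } finally {
            Files.delete(upload);
        }
    }
}
```

What this prints depends on the platform and on which detectors are on the classpath. With an extension-based default detector, the fake .png may well be reported as image/png, which is exactly the problem.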

Now there are two questions regarding the use of TikaFileTypeDetector and the required dependencies:

1. The transitive dependencies of tika-java7 are really big (more than 50 megabytes). I know that Apache Tika is not just about file type detection but does much more, such as metadata extraction, which we don't need at all. Would it be okay to exclude tika-parsers, which is the biggest module, without losing file type detection abilities?

2. Unfortunately, TikaFileTypeDetector is optimized for efficiency: it short-circuits if the content type can be guessed from the file extension. This is insecure for our purposes, since we want to protect our users from tampered file uploads. We always want the deep (and expensive) content type analysis that looks at the content itself, e.g. the magic bytes. We currently work around this limitation by explicitly using .tmp as the file name's extension, so that Tika detects application/octet-stream and is forced to go ahead. But that relies on white-box knowledge, which is not ideal. Could you provide different implementations of java.nio.file.spi.FileTypeDetector, say "eager" and "lazy"? Please note that we are not allowed to introduce required dependencies, i.e. using Tika directly is not an option.
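To illustrate what we mean by an "eager" detector in question 2, here is a toy sketch that never looks at the file name and decides purely from the leading bytes. It is not Tika code; the class name EagerMagicDetector is hypothetical, and it knows only two signatures (PNG and PDF), whereas a real implementation would delegate to Tika's content-based detection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.spi.FileTypeDetector;
import java.util.Arrays;

// Hypothetical "eager" detector: ignores the extension, inspects content only.
public class EagerMagicDetector extends FileTypeDetector {

    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};

    @Override
    public String probeContentType(Path path) throws IOException {
        byte[] head = new byte[8];
        int read = 0;
        // Read up to 8 leading bytes; short files simply yield fewer bytes.
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            while (read < head.length
                    && (n = in.read(head, read, head.length - read)) > 0) {
                read += n;
            }
        }
        if (startsWith(head, read, PNG)) {
            return "image/png";
        }
        if (startsWith(head, read, PDF)) {
            return "application/pdf";
        }
        return null; // not recognized; lets other detectors have a go
    }

    private static boolean startsWith(byte[] head, int read, byte[] magic) {
        return read >= magic.length
                && Arrays.equals(Arrays.copyOf(head, magic.length), magic);
    }

    public static void main(String[] args) throws IOException {
        // PNG signature stored under a misleading .txt extension.
        Path fake = Files.createTempFile("upload-", ".txt");
        try {
            Files.write(fake, PNG);
            System.out.println(new EagerMagicDetector().probeContentType(fake)); // prints "image/png"
        } finally {
            Files.delete(fake);
        }
    }
}
```

Such a class would be picked up by Files.probeContentType if its fully qualified name were listed in META-INF/services/java.nio.file.spi.FileTypeDetector. That is the same mechanism through which we currently consume TikaFileTypeDetector without a compile-time dependency on Tika.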

Here are the related issues:
https://github.com/primefaces/primefaces/issues/2791
https://github.com/primefaces/primefaces/issues/4244

And the pull requests already merged into PrimeFaces 6.3:
https://github.com/primefaces/primefaces/pull/4242
https://github.com/primefaces/primefaces/pull/4249

Thanks in advance for your response.

Kind regards, cnsgithub
