Bug#898822: [RFC] Detect data embeded image in html like file

2018-05-16 Thread Bastien ROUCARIES
On Wed, May 16, 2018 at 4:00 PM, Bastien ROUCARIES
 wrote:
> On Wed, May 16, 2018 at 11:33 AM, Chris Lamb  wrote:
>> retitle 898822 Detect data encoded/embedded in HTML "Data" URI schemes
>> severity 898822 wishlist
>> tags 898822 + moreinfo
>> thanks
>>
>> Hi Bastien,
>>
>> [..]
>>
>> I think some concrete examples here would be useful in triaging/
>> prioritising this, as well as working out whether it is feasible or
>> sensible :)
> A code search with the query
> (https://codesearch.debian.net/search?q=src%3D%22data%3A=1=1)
> gives 75 affected packages:
> asciidoctor
> cacti
> chemical-structures
> chromium-browser
> ckeditor
> classified-ads
> diffoscope
> edbrowse
> firefox
> firefox-esr
> fontforge
> fossil
> gitinspector
> golang-github-microcosm-cc-bluemonday
> html5lib
> icingaweb2
> ikiwiki
> ipython
> jmol
> julia
> kmplayer
> kopano-webapp
> landslide
> libcgi-application-plugin-dbiprofile-perl
> libxml-atom-fromowl-perl
> libxml-atom-owl-perl
> lua-apr
> matplotlib
> mayavi2
> mediawiki
> nbconvert
> node-normalize.css
> notmuch
> oca-core
> openlp
> opennebula
> openscad
> pandoc
> php-doctrine-bundle
> php-getid3
> php-kdyby-events
> phpmyadmin
> python-cartopy
> python-darkslide
> python-mne
> python-pweave
> python-pydub
> python-pyqrcode
> python-qtconsole
> qtwebengine-opensource-src
> rails
> rapid-photo-downloader
> r-cran-knitr
> r-cran-repr
> r-cran-rmarkdown
> rdkit
> request-tracker4
> roundcube
> rss-bridge
> rubocop
> sagemath
> sass-spec
> simplesamlphp
> spip
> sympa
> thunderbird
> trac
> turbogears2-doc
> veusz
> virtuoso-opensource
> vistrails
> woo
> xhtml2pdf
> yt
> zotero-standalone-build
>
> Some are clearly abuse, see:
> 1.
> https://sources.debian.org/src/chemical-structures/2.2.dfsg.0-12/debian/patches/privacy.patch/?hl=10#L10
> (renders the package undistributable: it is one of the SourceForge logos)
> 2.
> https://codesearch.debian.net/show?file=lua-apr_0.23.2.dfsg-4%2Fsrc%2Fbase64.c=33
> FTBFS, and not the preferred form of modification
> 3.
> https://sources.debian.org/src/rubocop/0.52.1+dfsg-1/debian/patches/04-adjust-tests-due-to-rubocop-logo-removal-from-package.diff/?hl=25#L25
> (the logo was removed as a file but not as embedded base64 => RC, undistributable)
> 4. https://sources.debian.org/src/fontforge/1:20170731%7Edfsg-1/debian/patches/2003_avoid_privacy_breach.patch/?hl=59#L59
> Borderline: it could use the same trick that I used in
> libjs-normalize.css, generating the image with JS (not the preferred
> form of modification)
>
> I have not checked all the packages.
>
> Another risk is carrying forbidden images, such as porn or things like
> that, in this kind of data. I would prefer lintian to flag this
> pedantically so that acceptability can be checked manually.
>
> Better safe than sorry.

This query is also interesting:
https://codesearch.debian.net/search?q=href%3D%22data%3A=1=1

>
> Bastien
>
>
>>
>> Best wishes,
>>
>> --
>>   ,''`.
>>  : :'  : Chris Lamb
>>  `. `'`  la...@debian.org / chris-lamb.co.uk
>>    `-



Bug#898822: [RFC] Detect data embeded image in html like file

2018-05-16 Thread Bastien ROUCARIES
On Wed, May 16, 2018 at 11:33 AM, Chris Lamb  wrote:
> retitle 898822 Detect data encoded/embedded in HTML "Data" URI schemes
> severity 898822 wishlist
> tags 898822 + moreinfo
> thanks
>
> Hi Bastien,
>
> [..]
>
> I think some concrete examples here would be useful in triaging/
> prioritising this, as well as working out whether it is feasible or
> sensible :)
A code search with the query
(https://codesearch.debian.net/search?q=src%3D%22data%3A=1=1)
gives 75 affected packages:
asciidoctor
cacti
chemical-structures
chromium-browser
ckeditor
classified-ads
diffoscope
edbrowse
firefox
firefox-esr
fontforge
fossil
gitinspector
golang-github-microcosm-cc-bluemonday
html5lib
icingaweb2
ikiwiki
ipython
jmol
julia
kmplayer
kopano-webapp
landslide
libcgi-application-plugin-dbiprofile-perl
libxml-atom-fromowl-perl
libxml-atom-owl-perl
lua-apr
matplotlib
mayavi2
mediawiki
nbconvert
node-normalize.css
notmuch
oca-core
openlp
opennebula
openscad
pandoc
php-doctrine-bundle
php-getid3
php-kdyby-events
phpmyadmin
python-cartopy
python-darkslide
python-mne
python-pweave
python-pydub
python-pyqrcode
python-qtconsole
qtwebengine-opensource-src
rails
rapid-photo-downloader
r-cran-knitr
r-cran-repr
r-cran-rmarkdown
rdkit
request-tracker4
roundcube
rss-bridge
rubocop
sagemath
sass-spec
simplesamlphp
spip
sympa
thunderbird
trac
turbogears2-doc
veusz
virtuoso-opensource
vistrails
woo
xhtml2pdf
yt
zotero-standalone-build

Some are clearly abuse, see:
1.
https://sources.debian.org/src/chemical-structures/2.2.dfsg.0-12/debian/patches/privacy.patch/?hl=10#L10
(renders the package undistributable: it is one of the SourceForge logos)
2.
https://codesearch.debian.net/show?file=lua-apr_0.23.2.dfsg-4%2Fsrc%2Fbase64.c=33
FTBFS, and not the preferred form of modification
3.
https://sources.debian.org/src/rubocop/0.52.1+dfsg-1/debian/patches/04-adjust-tests-due-to-rubocop-logo-removal-from-package.diff/?hl=25#L25
(the logo was removed as a file but not as embedded base64 => RC, undistributable)
4. https://sources.debian.org/src/fontforge/1:20170731%7Edfsg-1/debian/patches/2003_avoid_privacy_breach.patch/?hl=59#L59
Borderline: it could use the same trick that I used in
libjs-normalize.css, generating the image with JS (not the preferred
form of modification)

I have not checked all the packages.

Another risk is carrying forbidden images, such as porn or things like
that, in this kind of data. I would prefer lintian to flag this
pedantically so that acceptability can be checked manually.

Better safe than sorry.

Bastien


>
> Best wishes,
>
> --
>   ,''`.
>  : :'  : Chris Lamb
>  `. `'`  la...@debian.org / chris-lamb.co.uk
>    `-



Processed: Re: Bug#898822: [RFC] Detect data embeded image in html like file

2018-05-16 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> retitle 898822 Detect data encoded/embedded in HTML "Data" URI schemes
Bug #898822 [lintian] [RFC] Detect data embeded image in html like file
Changed Bug title to 'Detect data encoded/embedded in HTML "Data" URI schemes' 
from '[RFC] Detect data embeded image in html like file'.
> severity 898822 wishlist
Bug #898822 [lintian] Detect data encoded/embedded in HTML "Data" URI schemes
Severity set to 'wishlist' from 'minor'
> tags 898822 + moreinfo
Bug #898822 [lintian] Detect data encoded/embedded in HTML "Data" URI schemes
Added tag(s) moreinfo.
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
898822: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898822
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#898822: [RFC] Detect data embeded image in html like file

2018-05-16 Thread Chris Lamb
retitle 898822 Detect data encoded/embedded in HTML "Data" URI schemes
severity 898822 wishlist
tags 898822 + moreinfo
thanks 

Hi Bastien,

[..]

I think some concrete examples here would be useful in triaging/
prioritising this, as well as working out whether it is feasible or
sensible :)


Best wishes,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org / chris-lamb.co.uk
   `-



Bug#898822: [RFC] Detect data embeded image in html like file

2018-05-16 Thread Bastien ROUCARIES
Package: lintian
Version: 2.5.86
Severity: minor


Hi,


This may be a hot topic, so I am asking for comments.

A not so well known feature of the HTML format is the data URI scheme,
which allows embedding content such as images directly in an HTML file (see
https://en.wikipedia.org/wiki/Data_URI_scheme).
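
To make the data URI scheme mentioned above concrete, here is a minimal
Python sketch that packs a few bytes into a base64 data URI and unpacks
them again. The helper names (make_data_uri, decode_data_uri) are
hypothetical and exist only for this illustration; it handles only the
";base64" form, not the percent-encoded one.

```python
import base64
from typing import Tuple


def make_data_uri(payload: bytes, media_type: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data URI into (media type, decoded bytes)."""
    header, _, body = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    media_type = header[len("data:"):-len(";base64")]
    return media_type, base64.b64decode(body)


# The 8-byte PNG file signature stands in for a real image here.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
uri = make_data_uri(PNG_MAGIC)

# This is what ends up inside the HTML file: no separate image file exists.
html = f'<img src="{uri}" alt="embedded">'
```

The point is that the image only exists as an opaque base64 string
inside the HTML, which is why it is hard to review or modify.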

I am sure that base64-encoded content such as images is not considered
the preferred form of modification, and I believe that lintian should
detect this kind of use in source files, in order to help the
ftpmasters' work.


There are also security implications, and I think it is good to detect
this kind of thing.

It is easy to implement:
- first, move the files.pm privacy-breach detection logic to a common
library (this is the part I need help with)
- detect the base64 encoding in the privacy-breach logic
- warn pedantically in files.pm for base64, and emit an error in cruft.pm
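
The detection step could look roughly like the sketch below. It is
written in Python purely for illustration; lintian itself is Perl, and
nothing here is lintian's actual code. The names DataUriHit,
DATA_URI_RE and find_embedded_data are assumptions made for this
example, and the regular expression only covers quoted src= and href=
attributes with a ";base64" payload.

```python
import base64
import binascii
import re
from typing import Iterator, NamedTuple


class DataUriHit(NamedTuple):
    attribute: str     # "src" or "href"
    media_type: str    # e.g. "image/png"; may be empty
    decoded_size: int  # size of the embedded payload in bytes


# Matches src="data:<type>;base64,<payload>" and the href equivalent,
# with either single or double quotes around the attribute value.
DATA_URI_RE = re.compile(
    r'''\b(src|href)\s*=\s*(["'])data:([^,"']*?);base64,([A-Za-z0-9+/=\s]*)\2''',
    re.IGNORECASE,
)


def find_embedded_data(text: str) -> Iterator[DataUriHit]:
    """Yield one hit per base64 data URI found in src/href attributes."""
    for match in DATA_URI_RE.finditer(text):
        try:
            # The default (non-validating) mode skips embedded whitespace.
            payload = base64.b64decode(match.group(4))
        except binascii.Error:
            continue  # malformed base64: not reported in this sketch
        yield DataUriHit(
            attribute=match.group(1).lower(),
            media_type=match.group(3).lower(),
            decoded_size=len(payload),
        )


sample = (
    '<img src="data:image/png;base64,iVBORw0KGgo=">'
    '<a href="data:text/plain;base64,aGVsbG8=">x</a>'
    '<img src="logo.png">'
)
hits = list(find_embedded_data(sample))
```

A real check would then decide, per hit, whether to emit a pedantic
tag or an error, for example based on the media type or decoded size.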

Any comments?

Bastien