D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-15 Thread Stefan Brüns
bruns added a comment.


  In D16498#359676 , @poboiko wrote:
  
  > It seems like you've pushed something that was not intended to be pushed 
(XML extractor parts)
  
  
  No. Phabricator is just stupid.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-15 Thread Igor Poboiko
poboiko added a comment.


  It seems like you've pushed something related to XML extractor as well

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-14 Thread Stefan Brüns
This revision was automatically updated to reflect the committed changes.
Closed by commit R286:205ed84ee213: [KFileMetaData] Add extractor for DSC 
conforming (Encapsulated) Postscript (authored by bruns).

CHANGED PRIOR TO COMMIT
  https://phabricator.kde.org/D16498?vs=44408=45492#toc

REPOSITORY
  R286 KFileMetaData

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D16498?vs=44408=45492

REVISION DETAIL
  https://phabricator.kde.org/D16498

AFFECTED FILES
  autotests/CMakeLists.txt
  autotests/postscriptdscextractortest.cpp
  autotests/postscriptdscextractortest.h
  autotests/samplefiles/test.eps
  autotests/samplefiles/test.ps
  src/extractors/CMakeLists.txt
  src/extractors/postscriptdscextractor.cpp
  src/extractors/postscriptdscextractor.h

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-14 Thread Igor Poboiko
poboiko accepted this revision.
poboiko added a comment.
This revision is now accepted and ready to land.


  Apart from trivial comment, this looks fine. I've tested it on my setup (with 
bunch of (e)ps files), and randomly chosen files seems to be indexed nicely. It 
also reduced the size of the index by almost 50MB, because those are not 
indexed as plaintext anymore :)
  Yet I would also vote for replacing it (eventually) with a full-featured 
extractor based on libspectre
  (I'm not a security specialist in any way, but that CVE doesn't look too 
harmful, and from my point of view it's not worth to abandon full support of 
(E)PS because of it)

INLINE COMMENTS

> postscriptdscextractortest.cpp:27
> +#include 
> +#include 
> +

This, and QDebug, seems to be not used anymore.

REPOSITORY
  R286 KFileMetaData

BRANCH
  postscript_dsc

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-08 Thread Stefan Brüns
bruns removed a reviewer: pino.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-08 Thread Stefan Brüns
bruns requested review of this revision.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-08 Thread Stefan Brüns
bruns added a comment.


  @pino - you have not answered for a week.
  
  You have set this to "Needs Revision", which removes it from the "Needs 
Review" queue for everyone else.
  
  If you wan't to see an extractor based on libspectre, thats fine, but then 
**you** have to write it.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-01 Thread Stefan Brüns
bruns added a comment.


  In D16498#352323 , @pino wrote:
  
  > In D16498#352314 , @bruns wrote:
  >
  > > Please answer why you consider running a full blown postscript 
interpreter in an uncontrolled environment (no sandboxing, runs without user 
interaction) is better than 20 code lines of trivial text parsing.
  >
  >
  > Please be patient, I've got other things that currently take my time, and I 
need to properly get some data to satisfty your questions.
  
  
  Thats completely fine for me, if you need ore time, just say so.
  
  > In the meanwhile, two suggestions:
  > 
  > - please try to see also other people's point of point, and not just yours; 
if I suggested something, then it's because, according to my 
knowledge/experience (which includes also maintaining okular in the past, FYI), 
I deem it the optimal way
  > - when asking for people's opinion, again, try to use a more friendly 
attitude; that will certainly help getting my attention faster, rather than 
feeling attacked because I'm expressing //technical// doubts on this solution
  
  Not answering can also be considered impolite. I am trying to solve a real 
problem here. Postscript files currently pose a significant problem for baloo. 
I listed //technical// reasons why libspectre/gs is **not** a good solution.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-01 Thread Pino Toscano
pino added a comment.


  In D16498#352314 , @bruns wrote:
  
  > Please answer why you consider running a full blown postscript interpreter 
in an uncontrolled environment (no sandboxing, runs without user interaction) 
is better than 20 code lines of trivial text parsing.
  
  
  Please be patient, I've got other things that currently take my time, and I 
need to properly get some data to satisfty your questions.
  
  In the meanwhile, two suggestions:
  
  - please try to see also other people's point of point, and not just yours; 
if I suggested something, then it's because, according to my 
knowledge/experience (which includes also maintaining okular in the past, FYI), 
I deem it the optimal way
  - when asking for people's opinion, again, try to use a more friendly 
attitude; that will certainly help getting my attention faster, rather than 
feeling attacked because I'm expressing //technical// doubts on this solution

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-11-01 Thread Stefan Brüns
bruns added a comment.


  @pino
  
  Please answer why you consider running a full blown postscript interpreter in 
an uncontrolled environment (no sandboxing, runs without user interaction) is 
better than 20 code lines of trivial text parsing.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-29 Thread Stefan Brüns
bruns added a comment.


  In D16498#350460 , @pino wrote:
  
  > In D16498#350426 , @bruns wrote:
  >
  > > In D16498#350422 , @pino wrote:
  > >
  > > > In D16498#350289 , @bruns 
wrote:
  > > >
  > > > > In D16498#350286 , @pino 
wrote:
  > > > >
  > > > > > Ugh no manual parsing of PS files -- please use libspectre.
  > > > >
  > > > >
  > > > > This is not Postscript parsing, but DSC parsing - read the 
specification to understand the difference!
  > > > >
  > > > > http://www.lprng.com/RESOURCES/ADOBE/5001.DSC_Spec.pdf
  > > >
  > > >
  > > > An EPS file //is// also a PostScript file, and indeed ghostscript opens 
it perfectly
  > >
  > >
  > >
  > >
  > > - also **
  >
  >
  > Sure: that is a reason more to handle it like that.
  >
  > >> Because of the above, libspectre perfectly handles EPS files, and the 
API already provides all the information that the current DscExtractor provides 
as well.
  > > 
  > > Its an additional dependency (libpectre and libgs),
  >
  >
  >
  > - libspectre is a C library, and uses only libgs
  > - the rest of the ghostscript dependencies are already used in one way or 
another in an average KDE installation
  > - okular & cantor already use libspectre for a very long time (okular a 
decade)
  
  
  It //may// be an option to use libspectre as basis for an additional 
extractor, but this should not be the default, see below.
  
  This is meant to be as simple as possible - the extractor itself is hardly 
more than 30 lines of code. There is definitely a use case for this extractor.
  
  >> and also imposes a security risk - see e.g. 
https://nvd.nist.gov/vuln/detail/CVE-2018-11645
  > 
  > There are way worse CVEs in lower components of a modern Linux stack (say 
in the Linux kernel).
  >  Also, according to that, should we drop the support in okular for 
PostScript files, and the support in cantor for EPS images?
  
  There is a difference between opening a file consciously and letting it 
happen by chance. The extractor is run when a file is hovered by the mouse 
cursor or by baloo. It will be executed without the user being aware of it.
  
  IMHO the ghostscript support should be disabled (runtime) by default in 
Okular, until it is run completely sandboxed.
  
  >>> In D16498#350325 , @bruns 
wrote:
  >>> 
   @pino - please remove your change request, if you have not read the code 
at all ...
  >>> 
  >>> 
  >>> Please tone down your attitude to something more respectful, thanks.
  >> 
  >> You started with "Ugh ..." - you comment lacks any respect ...
  > 
  > Certainly this was not the case, sorry if it was not intended. But even 
then, that was geared towards //the code//, and that does not remotely justify 
attacking //the person// with "you did not read the code" (which is wrong).
  
  The code clearly states it targets (E)PS **DSC**. A full blown PS interpreter 
//may// be able to extract more info from the file, but not without the 
mentioned drawbacks. Blankly stating using libspectre is better and should be 
used, without weighting pros and cons, does not give the impression (//to me//) 
you have evaluated it carefully.
  
  Apology accepted.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-29 Thread Pino Toscano
pino added a comment.


  In D16498#350426 , @bruns wrote:
  
  > In D16498#350422 , @pino wrote:
  >
  > > In D16498#350289 , @bruns 
wrote:
  > >
  > > > In D16498#350286 , @pino 
wrote:
  > > >
  > > > > Ugh no manual parsing of PS files -- please use libspectre.
  > > >
  > > >
  > > > This is not Postscript parsing, but DSC parsing - read the 
specification to understand the difference!
  > > >
  > > > http://www.lprng.com/RESOURCES/ADOBE/5001.DSC_Spec.pdf
  > >
  > >
  > > An EPS file //is// also a PostScript file, and indeed ghostscript opens 
it perfectly
  >
  >
  >
  >
  > - also **
  
  
  Sure: that is a reason more to handle it like that.
  
  >> Because of the above, libspectre perfectly handles EPS files, and the API 
already provides all the information that the current DscExtractor provides as 
well.
  > 
  > Its an additional dependency (libpectre and libgs),
  
  
  
  - libspectre is a C library, and uses only libgs
  - the rest of the ghostscript dependencies are already used in one way or 
another in an average KDE installation
  - okular & cantor already use libspectre for a very long time (okular a 
decade)
  
  > and also imposes a security risk - see e.g. 
https://nvd.nist.gov/vuln/detail/CVE-2018-11645
  
  There are way worse CVEs in lower components of a modern Linux stack (say in 
the Linux kernel).
  Also, according to that, should we drop the support in okular for PostScript 
files, and the support in cantor for EPS images?
  
  >> In D16498#350325 , @bruns wrote:
  >> 
  >>> @pino - please remove your change request, if you have not read the code 
at all ...
  >> 
  >> 
  >> Please tone down your attitude to something more respectful, thanks.
  > 
  > You started with "Ugh ..." - you comment lacks any respect ...
  
  Certainly this was not the case, sorry if it was not intended. But even then, 
that was geared towards //the code//, and that does not remotely justify 
attacking //the person// with "you did not read the code" (which is wrong).

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-29 Thread Stefan Brüns
bruns added a comment.


  In D16498#350422 , @pino wrote:
  
  > In D16498#350289 , @bruns wrote:
  >
  > > In D16498#350286 , @pino wrote:
  > >
  > > > Ugh no manual parsing of PS files -- please use libspectre.
  > >
  > >
  > > This is not Postscript parsing, but DSC parsing - read the specification 
to understand the difference!
  > >
  > > http://www.lprng.com/RESOURCES/ADOBE/5001.DSC_Spec.pdf
  >
  >
  > An EPS file //is// also a PostScript file, and indeed ghostscript opens it 
perfectly
  
  
  
  
  - also **
  
  > Because of the above, libspectre perfectly handles EPS files, and the API 
already provides all the information that the current DscExtractor provides as 
well.
  
  Its an additional dependency (libpectre and libgs), and also imposes a 
security risk - see e.g. https://nvd.nist.gov/vuln/detail/CVE-2018-11645
  
  > In D16498#350325 , @bruns wrote:
  > 
  >> @pino - please remove your change request, if you have not read the code 
at all ...
  > 
  > 
  > Please tone down your attitude to something more respectful, thanks.
  
  You started with "Ugh ..." - you comment lacks any respect ...

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-29 Thread Pino Toscano
pino added a comment.


  In D16498#350289 , @bruns wrote:
  
  > In D16498#350286 , @pino wrote:
  >
  > > Ugh no manual parsing of PS files -- please use libspectre.
  >
  >
  > This is not Postscript parsing, but DSC parsing - read the specification to 
understand the difference!
  >
  > http://www.lprng.com/RESOURCES/ADOBE/5001.DSC_Spec.pdf
  
  
  An EPS file //is// also a PostScript file, and indeed ghostscript opens it 
perfectly fine.
  Because of the above, libspectre perfectly handles EPS files, and the API 
already provides all the information that the current DscExtractor provides as 
well.
  
  In D16498#350325 , @bruns wrote:
  
  > @pino - please remove your change request, if you have not read the code at 
all ...
  
  
  Please tone down your attitude to something more respectful, thanks.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-29 Thread Stefan Brüns
bruns added a comment.


  @pino - please remove your change request, if you have not read the code at 
all ...

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-29 Thread Stefan Brüns
bruns added a comment.


  In D16498#350286 , @pino wrote:
  
  > Ugh no manual parsing of PS files -- please use libspectre.
  
  
  This is not Postscript parsing, but DSC parsing - read the specification to 
understand the difference!

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-29 Thread Pino Toscano
pino requested changes to this revision.
pino added a comment.
This revision now requires changes to proceed.


  Ugh no manual parsing of PS files -- please use libspectre.

REPOSITORY
  R286 KFileMetaData

REVISION DETAIL
  https://phabricator.kde.org/D16498

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko, pino
Cc: pino, kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, 
ngraham, bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-29 Thread Stefan Brüns
bruns updated this revision to Diff 44408.
bruns added a comment.


  Replace QDir::Separator with "/"

REPOSITORY
  R286 KFileMetaData

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D16498?vs=44388=44408

BRANCH
  postscript_dsc

REVISION DETAIL
  https://phabricator.kde.org/D16498

AFFECTED FILES
  autotests/CMakeLists.txt
  autotests/postscriptdscextractortest.cpp
  autotests/postscriptdscextractortest.h
  autotests/samplefiles/test.eps
  autotests/samplefiles/test.ps
  src/extractors/CMakeLists.txt
  src/extractors/postscriptdscextractor.cpp
  src/extractors/postscriptdscextractor.h

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko
Cc: kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, ngraham, 
bruns, abrahams


D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript

2018-10-28 Thread Stefan Brüns
bruns created this revision.
bruns added reviewers: Frameworks, Baloo, astippich, ngraham, poboiko.
Herald added projects: Frameworks, Baloo.
Herald added a subscriber: kde-frameworks-devel.
bruns requested review of this revision.

REVISION SUMMARY
  Postscript files currently fall back to the plaintext extractor due to
  mimetype inheritance. This adds lots of garbage to the index and misses
  any useful data in the DSC comments.

TEST PLAN
  make && ctest

REPOSITORY
  R286 KFileMetaData

BRANCH
  postscript_dsc

REVISION DETAIL
  https://phabricator.kde.org/D16498

AFFECTED FILES
  autotests/CMakeLists.txt
  autotests/postscriptdscextractortest.cpp
  autotests/postscriptdscextractortest.h
  autotests/samplefiles/test.eps
  autotests/samplefiles/test.ps
  src/extractors/CMakeLists.txt
  src/extractors/postscriptdscextractor.cpp
  src/extractors/postscriptdscextractor.h

To: bruns, #frameworks, #baloo, astippich, ngraham, poboiko
Cc: kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, ngraham, 
bruns, abrahams