Re: [racket-users] error with pdf-read / libpoppler on Windows

2018-03-23 Thread David Storrs
On Thu, Mar 22, 2018 at 6:45 PM, Neil Van Dyke wrote:

> (Warnings: the PDF documentation is big, you'll appreciate why C and C++
> programmers have difficulty implementing PDF without filling it with
> vulnerabilities, and you'll also get to see some totalitarian-friendly
> features that have been added to PDF.)

Having done exactly this (although in Perl, not C/C++), I can attest
that parsing, modifying, and/or generating arbitrary PDFs is a bloody
nightmare.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] error with pdf-read / libpoppler on Windows

2018-03-23 Thread Joel Dueck
This point is well taken. For my package's core functionality I mainly need 
to be able to get a page count and the size of the first page. I would 
rather not have to write my own functions to do that; even though 
it looks (from the docs you provided) like the parsing wouldn't be too 
difficult, it's annoying to reinvent the wheel and potentially have to 
support code that may break in any number of edge cases. But if I have to, 
maybe I'll make a separate package.

Security isn't really an issue for my package, but I'm already thinking my 
users should probably not have to deal with DLL problems like this.

The command line tool is a good idea too, but it doesn't seem like a good 
option for my package, which, again, I'd like to be usable without hassle 
on any platform.

The only other loss was that for my scribblings I was hoping to use 
`pdf->pict` to demonstrate the results of my code samples, but I suppose I 
could just fake it with some PNGs...and try to remember to update them if I 
change anything.




Re: [racket-users] error with pdf-read / libpoppler on Windows

2018-03-22 Thread Neil Van Dyke
Side comment for the list, on software engineering security practice... 
If what you have to do with the PDF is simple to do to the raw PDF 
format, and your code might be fed PDF files of provenance that you 
don't control, it's probably better security not to involve Poppler.


https://nvd.nist.gov/vuln/search/results?adv_search=false&form_type=basic&results_type=overview&search_type=all&query=poppler

https://www.adobe.com/devnet/pdf.html

In the past, I've found that pure Racket is quite capable of parsing PDF 
well enough to do what I've needed to do for consulting clients (e.g., 
extracting editable forms data).
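
For the page-count and first-page-size case raised in this thread, a rough pure-Racket sketch might look like the following. It is only a heuristic, not a PDF parser: it assumes the page objects are stored uncompressed (files that use object streams will defeat it), that the first `/MediaBox` in the file belongs to the first page rather than being inherited from a parent node, and that no stale objects remain from incremental updates. The function names are made up for illustration.

```racket
#lang racket/base

;; Count page objects by scanning for "/Type /Page" that is not
;; followed by a letter, so "/Type /Pages" nodes are skipped.
(define (naive-page-count pdf-bytes)
  (length
   (regexp-match-positions* #px#"/Type\\s*/Page(?![A-Za-z])" pdf-bytes)))

;; Find the first "/MediaBox [llx lly urx ury]" in the file and
;; return (list width height) in points, or #f if none is found.
(define (naive-first-media-box pdf-bytes)
  (define m
    (regexp-match
     #px#"/MediaBox\\s*\\[\\s*([-0-9.]+)\\s+([-0-9.]+)\\s+([-0-9.]+)\\s+([-0-9.]+)\\s*\\]"
     pdf-bytes))
  (and m
       (let ([nums (map (lambda (b)
                          (string->number (bytes->string/latin-1 b)))
                        (cdr m))])
         ;; string->number yields #f for malformed numbers; give up then.
         (and (andmap values nums)
              (list (- (caddr nums) (car nums))
                    (- (cadddr nums) (cadr nums)))))))
```

With `racket/file` required, `(naive-page-count (file->bytes "some.pdf"))` would apply it to a file on disk (the path here is a placeholder). The caveats above are exactly the kind of edge cases that make a hand-rolled approach a maintenance burden.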


A compromise, if you don't want to implement anything in pure Racket, 
but a command line tool can do what you need, is to call that command 
line tool from Racket, keeping likely memory exploits from giving 
you a hard-to-debug mess in your Racket VM memory space.  If you want to 
be extra careful, you can use various host OS features to isolate that 
host process.
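
As a minimal sketch of the command-line approach, assuming the `pdfinfo` utility (shipped with Xpdf and with Poppler's utilities) is on the PATH and prints `Pages:` and `Page size:` lines in its usual format, something like this keeps the PDF parsing in a separate OS process. The function names are hypothetical, and on Windows the executable name would likely need to be "pdfinfo.exe".

```racket
#lang racket/base
(require racket/system)

;; Parse the text printed by pdfinfo. Assumes lines shaped like
;;   Pages:          3
;;   Page size:      612 x 792 pts (letter)
;; Returns (values page-count width height); each is #f when absent.
(define (parse-pdfinfo-output text)
  (define pages (regexp-match #px"(?m:^Pages:\\s+(\\d+))" text))
  (define size
    (regexp-match #px"(?m:^Page size:\\s+([0-9.]+) x ([0-9.]+) pts)" text))
  (values (and pages (string->number (cadr pages)))
          (and size (string->number (cadr size)))
          (and size (string->number (caddr size)))))

;; Run pdfinfo as a child process and return its standard output as a
;; string, or #f when the tool is not found or exits with an error.
(define (run-pdfinfo pdf-path)
  (define exe (find-executable-path "pdfinfo"))
  (and exe
       (let* ([out (open-output-string)]
              [ok? (parameterize ([current-output-port out])
                     (system* exe pdf-path))])
         (and ok? (get-output-string out)))))
```

A caller would check the result of `run-pdfinfo` for `#f` before handing it to `parse-pdfinfo-output`. A crash or memory corruption in the tool then shows up as a failed exit status rather than as damage inside the Racket process.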


(Warnings: the PDF documentation is big, you'll appreciate why C and C++ 
programmers have difficulty implementing PDF without filling it with 
vulnerabilities, and you'll also get to see some totalitarian-friendly 
features that have been added to PDF.)


(Also, it's not just the apparent authors of the immediate PDF tool in 
whom you have to have confidence -- some of the PDF tools include 
ancient-ancient PS C code, as well as link a lot of other 
known-problematic libraries, such as for 2D pixmaps and fonts.)

