Re: [qubes-devel] Re: Refactoring PDF Converter (and other scripts)

Demi M. Obenour Thu, 02 Apr 2020 19:00:04 -0700

On 2020-04-02 21:46, Marek Marczykowski-Górecki wrote:
>> Marek: is OCR on a converted PDF safe? Being able to reconstruct the
>> text is very much useful.
> 
> That's a tricky question. qpdf-convert-server has significant control
> over the input to such OCR (within the realm of valid image data). So, given
> the complexity of OCR software, I think nothing can be completely ruled out.
> But also, I think (because of the guaranteed proper input format) some
> catastrophic failure is unlikely.
> 
> In fact, I am considering another method for preserving text data: an
> enhanced "simpler representation" which, besides the pure image, also
> contains text annotations, something like a series of (coordinates, text)
> pairs. This needs careful design to be reasonably safe (for example,
> defining what "text" could contain, so as not to risk re-interpreting it
> as something else in the PDF, or in some intermediate tool).


That would be absolutely awesome.  The biggest problem with
qvm-convert-pdf is the loss of text, and keeping the text would make
it far more usable.
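
To make the idea concrete, here is a rough sketch of how the receiving
side might validate such (coordinates, text) pairs before using them.
Everything in it is an assumption for illustration: the Annotation
type, the MAX_TEXT_LEN limit, and the character whitelist are
hypothetical and not part of qvm-convert-pdf or any agreed format.

```python
import unicodedata
from typing import NamedTuple

MAX_TEXT_LEN = 256  # assumed limit, chosen arbitrarily for illustration


class Annotation(NamedTuple):
    """Hypothetical text annotation: a pixel bounding box plus its text."""
    x: int
    y: int
    width: int
    height: int
    text: str


def is_safe_text(text: str) -> bool:
    """Accept only short strings of letters, digits, punctuation,
    symbols, and plain ASCII spaces."""
    if not 0 < len(text) <= MAX_TEXT_LEN:
        return False
    for ch in text:
        if ch == " ":
            continue
        # Unicode general category: L=letter, N=number, P=punctuation,
        # S=symbol. Everything else (control, format, other separators,
        # unassigned) is rejected.
        if unicodedata.category(ch)[0] not in "LNPS":
            return False
    return True


def validate(ann: Annotation, page_width: int, page_height: int) -> bool:
    """Check that the box lies inside the page and the text passes
    the whitelist."""
    fields = (ann.x, ann.y, ann.width, ann.height)
    if not all(type(v) is int for v in fields):
        return False
    if ann.x < 0 or ann.y < 0 or ann.width <= 0 or ann.height <= 0:
        return False
    if ann.x + ann.width > page_width or ann.y + ann.height > page_height:
        return False
    return isinstance(ann.text, str) and is_safe_text(ann.text)
```

With this, validate(Annotation(10, 20, 100, 12, "Hello, world"), 612, 792)
is accepted, while text containing an escape character, a newline, or a
bidirectional override is rejected.  A real design would probably want
a stricter whitelist still, and would also have to escape the text for
whatever PDF writer ends up consuming it.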
 
>> Also, could this be integrated into CUPS?
> 
> I don't see why not.

Given how insecure printers are, this would be a very good idea.
Perhaps similar technology (possibly based on something like seL4
instead of Xen) could be incorporated into printers themselves.

Sincerely,

Demi
