On Mon, 19 Sep 2016, Bob Paulin wrote:
I think it's a good thing to discuss. I know there are other features that are targeted for 2.0. Do we have a general sense of where those features are at?


I think the big one we need to crack is allowing multiple parsers to run against a file. OCR is probably the most critical of these from the modularisation perspective, with all those nasty interlinkings between the parsers to allow the manual delegation. If we can crack the problem of multiple parsers, those proxy issues should go away (or at least get better!)

As a bonus, it ought to also improve things for error cases (fallback parsers etc.), but for your needs, the simplification for "OCR + image metadata" is likely your biggest win!

(I think it might also let us tidy up some of the enhancement parsers too, like how the NLP stuff fits into the parsing framework)
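
To make the "multiple parsers against one file" idea a bit more concrete, here is a rough sketch. Every name in it (MultiParserSketch, SketchParser, parseWithAll, the metadata keys) is hypothetical and made up for illustration; none of it is Tika's actual API or a proposed design. It only assumes that each parser can add what it finds to a shared metadata map, and that one parser failing shouldn't stop the others.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiParserSketch {

    /** Hypothetical stand-in for a parser; NOT Tika's Parser interface. */
    @FunctionalInterface
    interface SketchParser {
        // Adds whatever it can extract to the shared metadata map.
        void parse(byte[] data, Map<String, String> metadata) throws Exception;
    }

    /**
     * Runs every parser against the same bytes and merges the results.
     * A failing parser is recorded by name instead of aborting the run.
     */
    static Map<String, String> parseWithAll(byte[] data, Map<String, SketchParser> parsers) {
        Map<String, String> metadata = new LinkedHashMap<>();
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, SketchParser> entry : parsers.entrySet()) {
            try {
                entry.getValue().parse(data, metadata);
            } catch (Exception e) {
                failed.add(entry.getKey());
            }
        }
        if (!failed.isEmpty()) {
            metadata.put("failed-parsers", String.join(",", failed));
        }
        return metadata;
    }

    /** Toy run: an "image metadata" parser, a pretend OCR parser, and one that fails. */
    static Map<String, String> demo() {
        Map<String, SketchParser> parsers = new LinkedHashMap<>();
        parsers.put("image-metadata", (data, md) -> md.put("size-bytes", String.valueOf(data.length)));
        parsers.put("ocr", (data, md) -> md.put("text", "(pretend OCR output)"));
        parsers.put("broken", (data, md) -> { throw new IllegalStateException("simulated failure"); });
        return parseWithAll(new byte[] {1, 2, 3}, parsers);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is that the OCR and image metadata parsers never need to know about each other: something outside them decides which parsers run and how the results are combined, which is where the manual delegation would otherwise live.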

Nick
