I’m still trying to understand how to read back the settings that have been set on a parseContext. In other words, let’s say that all I have is a parseContext, and I have no idea what configs have been added to it. Is there a way to extract the parsers or the configs from the parseContext and view their settings? I can use the settings that I *think* I passed into it, but I would rather get them from the parseContext itself, to ensure that they are what I think they are.
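To make the question concrete, here is a minimal sketch of the kind of read-back I am after. It does not use Tika at all: `TypedContext` and `DemoPdfConfig` are hypothetical stand-ins made up for illustration, assuming a context that stores one object per class key. In Tika itself I believe the counterpart is `parseContext.get(PDFParserConfig.class)`, but that would presumably only return what was explicitly set on the context, not anything the parser picks up from tika-config.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a type-keyed context (NOT Tika's ParseContext).
class TypedContext {
    private final Map<String, Object> entries = new HashMap<>();

    // Store a value under its class name; a null value removes the entry.
    <T> void set(Class<T> key, T value) {
        if (value != null) {
            entries.put(key.getName(), value);
        } else {
            entries.remove(key.getName());
        }
    }

    // Return the stored value for this class, or null if nothing was set.
    <T> T get(Class<T> key) {
        return key.cast(entries.get(key.getName()));
    }

    // Return the stored value, or the supplied default if nothing was set.
    <T> T get(Class<T> key, T defaultValue) {
        T value = get(key);
        return value != null ? value : defaultValue;
    }
}

// Hypothetical config object with a single setting, for illustration only.
class DemoPdfConfig {
    enum OcrStrategy { NO_OCR, AUTO }

    private OcrStrategy ocrStrategy = OcrStrategy.NO_OCR;

    OcrStrategy getOcrStrategy() { return ocrStrategy; }
    void setOcrStrategy(OcrStrategy strategy) { this.ocrStrategy = strategy; }
}

class ContextReadBackDemo {
    // Read the setting from the context itself instead of trusting
    // whatever the caller believes it passed in.
    static DemoPdfConfig.OcrStrategy strategyIn(TypedContext ctx) {
        DemoPdfConfig fallback = new DemoPdfConfig(); // defaults only
        return ctx.get(DemoPdfConfig.class, fallback).getOcrStrategy();
    }

    public static void main(String[] args) {
        TypedContext ctx = new TypedContext();
        System.out.println("Before set: " + strategyIn(ctx));

        DemoPdfConfig cfg = new DemoPdfConfig();
        cfg.setOcrStrategy(DemoPdfConfig.OcrStrategy.AUTO);
        ctx.set(DemoPdfConfig.class, cfg);
        System.out.println("After set: " + strategyIn(ctx));
    }
}
```

What this sketch cannot show is the part I am really unsure about: whether the value read back this way is the one the parser actually uses once it combines the context with its own configuration.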
From: Peter Kronenberg <[email protected]>
Sent: Wednesday, February 10, 2021 10:12 AM
To: [email protected]
Subject: New config paradigm

Ok, I’m gonna have questions 😊

In this code, I assume that this extracts the settings that are in the tika-config. And we have to extract one parser at a time, right?

    try (InputStream is = TikaOCRParser.class.getResourceAsStream("/tika-config.xml")) {
        tikaConfig = new TikaConfig(is);
    }
    Parser pdfParser = findParser(tikaConfig.getParser(), org.apache.tika.parser.pdf.PDFParser.class);
    PDFParserConfig pdfParserConfig = ((PDFParser)pdfParser).getPDFParserConfig();
    System.out.println("OCR Strategy: " + pdfParserConfig.getOcrStrategy());

If I then proceed to do this:

    final PDFParserConfig pdfConfig = new PDFParserConfig();
    pdfConfig.setOcrStrategy(PDFParserConfig.OCR_STRATEGY.AUTO);
    final AutoDetectParser parser = new AutoDetectParser(tikaConfig);
    final ParseContext parseContext = new ParseContext();
    parseContext.set(AutoDetectParser.class, parser);
    parseContext.set(PDFParserConfig.class, pdfConfig);

How do I now get the values that are being used in the composite parseContext? I want to confirm that the values are as expected.
