Zdenko:

I have the following use case for the Tesseract C++ API 4.1.
I would like to read a multi-page, non-searchable PDF file as input
into a PIX or PIXA and create a searchable PDF file as output.

My question to you:
Which Tesseract C++ API function can I call
to read the multi-page, non-searchable PDF file into a PIX or PIXA?
Do you have a small C++ example for this?
I mean exactly like the command line: tesseract test.pdf output pdf
(test.pdf is a multi-page PDF file given as the input parameter)


On Friday, October 25, 2019 at 16:35:14 UTC+2, Ivica Anic wrote:
>
> Hi,
> I am testing the Tesseract C++ API (version 4.1).
> Here is my code:
>
>       
> const char *datapath = "C:\\Temp\\tessdata-master";
> string language_ = "deu";
> string inputImage = "./input.png";
> tesseract::TessBaseAPI *api100 = new tesseract::TessBaseAPI();
> if (api100->Init(datapath, "deu", tesseract::OEM_LSTM_ONLY)) {
>     fprintf(stderr, "Could not initialize tesseract.\n");
>     exit(1);
> }
>
> api100->SetVariable("tessedit_create_pdf", "T");
> // The PNG file is the input file
> PIX *sourceImg100 = pixRead(inputImage.c_str());
>
> api100->SetImage(sourceImg100);
>
> api100->Recognize(0);
>
> api100->SetPageSegMode(tesseract::PSM_AUTO_ONLY);
> api100->SetInputName(inputImage.c_str());
> tesseract::TessResultRenderer *renderer100 =
>     new tesseract::TessPDFRenderer("output_base", api100->GetDatapath(), false);
>
> renderer100->BeginDocument("test");
> renderer100->AddImage(api100);
> api100->ProcessPage(sourceImg100, 0, inputImage.c_str(), NULL, 5000, renderer100);
> renderer100->EndDocument();
> api100->End();
> pixDestroy(&sourceImg100);
>     
> How can I get a searchable PDF file as output and save it on my computer?
> I mean exactly like the command line: tesseract test.tif output pdf
>
> Zdenko:
> In my test an output PDF file is created, but it is not readable.
> When I try to open the PDF file, I get an error saying that the XREF data
> in the PDF file is missing.
>
> Thanks a lot
>
