I am not sure which OS you use, but AFAIK ImageMagick uses Ghostscript
internally.
After some testing (in another project) I settled on this command for
converting PDF to TIFF (Windows version; on other OSes you need to find
the correct name of the Ghostscript executable):

gswin64c.exe -dBATCH -dTextAlphaBits=4 -dGraphicsAlphaBits=4
-dNOPAUSE -r300 -sDEVICE=tiffgray  -sOutputFile=output.tif input.pdf
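
For scripting, the same call can be sketched in Python. This is only a
minimal sketch: the function names are made up for illustration, and it
assumes the usual executable names (gswin64c for the 64-bit Windows
console build, gs on Linux/macOS). Adjust if your install differs.

```python
import platform
import subprocess


def gs_executable_name(system=None):
    """Return the usual Ghostscript console executable name for an OS."""
    system = system or platform.system()
    # Assumption: 64-bit console build on Windows, plain "gs" elsewhere.
    return "gswin64c" if system == "Windows" else "gs"


def build_gs_command(input_pdf, output_tif, dpi=300, system=None):
    """Build the argument list for the PDF-to-TIFF command shown above."""
    return [
        gs_executable_name(system),
        "-dBATCH",
        "-dNOPAUSE",
        "-dTextAlphaBits=4",
        "-dGraphicsAlphaBits=4",
        f"-r{dpi}",
        "-sDEVICE=tiffgray",
        f"-sOutputFile={output_tif}",
        input_pdf,
    ]


def convert_pdf_to_tiff(input_pdf, output_tif, dpi=300):
    """Run Ghostscript; it must be installed and on PATH."""
    subprocess.run(build_gs_command(input_pdf, output_tif, dpi), check=True)
```

Calling convert_pdf_to_tiff("input.pdf", "output.tif") runs the command;
build_gs_command alone only returns the argument list, so it can be
inspected without Ghostscript installed.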

Note: I prefer to use tiffgray instead of tiffg4, because tiffg4 output is
usually ugly (you can get a much better result if you convert the color image
to gray first and then, in the next step, to g4/binary).
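
To illustrate the second step (gray to binary), here is a minimal
pure-Python sketch. The choice of Otsu's method for picking the threshold
is an assumption made for this example, not something tested in this
thread, and the function names are hypothetical. It works on a flat list
of 8-bit gray values; reading and writing the actual image is left to
whatever imaging library you use.

```python
def otsu_threshold(pixels):
    """Return a threshold (0-255) that best separates dark from light values.

    Otsu's method: pick the threshold that maximizes the between-class
    variance of the two resulting groups.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 0, -1.0
    w_dark = 0    # number of pixels with value <= t
    sum_dark = 0  # sum of those pixel values
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Proportional to the between-class variance.
        score = w_dark * w_light * (mean_dark - mean_light) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(pixels, threshold=None):
    """Map gray values to 0 (black) or 255 (white)."""
    if threshold is None:
        threshold = otsu_threshold(pixels)
    return [255 if p > threshold else 0 for p in pixels]
```

Working from the gray image lets you choose the threshold yourself
instead of leaving the reduction to black and white to the tiffg4
device.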

Another option is to use poppler (quite a pain to get working on Windows,
but no problem on Linux). It includes the utility pdftoppm, which can produce
jpg, png or tiff output, reduce the color space (gray, mono), set the tiff
compression (none, packbits, jpeg, lzw, deflate)...
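
A minimal sketch of how a pdftoppm call with these options could be
assembled in Python. The helper name and its defaults are assumptions for
illustration; the flags themselves (-r, -jpeg, -png, -tiff, -gray, -mono,
-tiffcompression) are the ones pdftoppm documents.

```python
def build_pdftoppm_command(input_pdf, output_root, fmt="tiff", color="gray",
                           dpi=300, tiff_compression=None):
    """Build a pdftoppm argument list: pdftoppm [options] PDF-file root."""
    formats = {"jpg": "-jpeg", "png": "-png", "tiff": "-tiff"}
    colors = {"color": None, "gray": "-gray", "mono": "-mono"}
    compressions = {"none", "packbits", "jpeg", "lzw", "deflate"}

    if fmt not in formats:
        raise ValueError(f"unsupported format: {fmt}")
    if color not in colors:
        raise ValueError(f"unsupported color mode: {color}")

    cmd = ["pdftoppm", "-r", str(dpi), formats[fmt]]
    if colors[color] is not None:
        cmd.append(colors[color])
    if tiff_compression is not None:
        # The compression setting only applies to tiff output.
        if fmt != "tiff":
            raise ValueError("tiff_compression requires fmt='tiff'")
        if tiff_compression not in compressions:
            raise ValueError(f"unsupported compression: {tiff_compression}")
        cmd += ["-tiffcompression", tiff_compression]
    cmd += [input_pdf, output_root]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) on a
machine where poppler is installed; pdftoppm then writes one file per
page, named after the given root.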

In my opinion these are the only working free, open-source, multi-platform
solutions with reasonable options.

Zdenko


On Sun, 5 May 2019 at 16:46, fady taher <[email protected]> wrote:

> *The tool I am currently using is ImageMagick. I tried converting the PDF
> to an image with another tool, and the result seems to come out correct.*
>
> On Sunday, May 5, 2019 at 4:19:15 PM UTC+2, shree wrote:
>>
>> The problem seems to be with the jpg image that you are using.
>>
>> I get correct results when using the pdf file with gimagereader.
>>
>> https://www.illinoiscapacitor.com/pdf/generated/lytics_products_detail_11040.pdf
>>
>>
>> Frequency Multipliers:
>>
>>
>> 50 HZ 120 HZ 400 HZ 1 KHZ 10 KHZ 100 KHZ
>>
>>
>> 0.9 1 1 1.15 (1/125 1.25
>>
>>
>> PHYSICAL DIMENSIONS
>>
>>
>> Diameter (D): 22 mm + 1mm
>>
>>
>> Length (L): 25 mm + 2 mm
>>
>>
>> Lead Spacing (S): 10 mm +/- 0.1 mm
>> Coating: BrownPET sleeving
>>
>> On Sun, May 5, 2019 at 7:33 PM fady taher <[email protected]> wrote:
>>
>>> *I used the options --fontlist "Calibri" and --max_iterations 3600*
>>>
>>>
>>> On Sunday, May 5, 2019 at 4:02:05 PM UTC+2, shree wrote:
>>>>
>>>> Which font did you use? Hopefully it was similar to your image. How
>>>> many iterations?
>>>>
>>>> On Sun, May 5, 2019 at 6:58 PM fady taher <[email protected]> wrote:
>>>>
>>>>> *I followed the instructions at*
>>>>> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00.md#fine-tuning-for-impact
>>>>> *and added (S) about 17 times to eng.training_text (attached).*
>>>>>
>>>>> On Sunday, May 5, 2019 at 3:17:55 PM UTC+2, shree wrote:
>>>>>>
>>>>>> Share an image for testing.
>>>>>>
>>>>>> How did you try to finetune?
>>>>>>
>>>>>>
>>>>>> On Sunday, May 5, 2019 at 5:40:39 PM UTC+5:30, fady taher wrote:
>>>>>>>
>>>>>>> *I do have numbers, but this character "S" is pretty obvious, yet I
>>>>>>> think it keeps being recognized as "5" because of the
>>>>>>> parentheses "(" and ")".*
>>>>>>>
>>>>>>> On Tuesday, April 30, 2019 at 5:32:14 AM UTC+2, Jonathan Muller
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> If you know you won't have numbers, what worked for me is
>>>>>>>> blacklisting numbers. Otherwise you will have to improve the image
>>>>>>>> quality (like resizing to a bigger size and sharpening the edges).
>>>>>>>>
>>>>>>>> On Mon, 29 Apr 2019 at 12:01, fady taher <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> The model keeps outputting (5) instead of (S). I tried fine-tuning,
>>>>>>>>> but it seems the process messed up the whole model. How can I
>>>>>>>>> increase the model's accuracy?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to [email protected].
>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/4687f9cb-ebc9-443d-bdbb-e9ba50f8014c%40googlegroups.com
>>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/4687f9cb-ebc9-443d-bdbb-e9ba50f8014c%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>> .
>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jonathan
>>>>>>>> 06.49.32.74.55
>>>>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> ____________________________________________________________
>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>
>>
>>
>> --
>>
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>

