I think it would need to operate at RIL_SYMBOL level, not RIL_TEXTLINE.
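
As a rough sketch of what I mean: once you have per-symbol results, combining them back into a line-sized box is just a union of rectangles. The `SymbolBox` struct and `MergeSymbols` function below are made-up names for illustration, not part of Tesseract. The comments describe how I would expect the entries to be filled from a `ResultIterator` at `RIL_SYMBOL`; I have not run that part against your image.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical holder for one symbol-level result; not a Tesseract type.
// With the real API, each entry would presumably be filled like this:
//   api->Recognize(nullptr);
//   tesseract::ResultIterator* ri = api->GetIterator();
//   ri->BoundingBox(tesseract::RIL_SYMBOL, &x1, &y1, &x2, &y2);
//   char* s = ri->GetUTF8Text(tesseract::RIL_SYMBOL);  // free with delete[]
//   float c = ri->Confidence(tesseract::RIL_SYMBOL);
//   ... and ri->Next(tesseract::RIL_SYMBOL) advances to the next symbol.
struct SymbolBox {
  int x1, y1, x2, y2;  // left, top, right, bottom in image coordinates
  std::string text;    // UTF-8 text of the symbol
  float conf;          // confidence, 0..100
};

// Merge symbol boxes into one enclosing box: union of the rectangles,
// concatenated text, and mean confidence.
SymbolBox MergeSymbols(const std::vector<SymbolBox>& symbols) {
  assert(!symbols.empty());
  SymbolBox merged = symbols.front();
  merged.text.clear();
  float conf_sum = 0.0f;
  for (const SymbolBox& s : symbols) {
    merged.x1 = std::min(merged.x1, s.x1);
    merged.y1 = std::min(merged.y1, s.y1);
    merged.x2 = std::max(merged.x2, s.x2);
    merged.y2 = std::max(merged.y2, s.y2);
    merged.text += s.text;
    conf_sum += s.conf;
  }
  merged.conf = conf_sum / static_cast<float>(symbols.size());
  return merged;
}
```

Note that this does not insert spaces between words; if you need them, you would have to detect word boundaries yourself while iterating.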

On Wednesday, April 21, 2021 at 7:17:04 AM UTC-5 [email protected] wrote:

> Hi, when I pass the tessedit_create_boxfile 1 argument to tesseract, it 
> outputs the location of each individual character. But when I use the API 
> like this:
>
> ```
> Boxa* boxes = api->GetComponentImages(tesseract::RIL_TEXTLINE, true, NULL, NULL);
> for (int i = 0; i < boxes->n; i++) {
>   BOX* box = boxaGetBox(boxes, i, L_CLONE);
>   api->SetRectangle(box->x, box->y, box->w, box->h);
>   char* outText = api->GetUTF8Text();
>   int conf = api->MeanTextConf();
>   fprintf(stdout, "Box[%d]: x=%d, y=%d, w=%d, h=%d, confidence: %d, text: %s",
>           i, box->x, box->y, box->w, box->h, conf, outText);
>   boxDestroy(&box);
>   delete[] outText;
> }
> ```
> it outputs the whole line, like this:
> Box[1]: x=36, y=84, w=246, h=14, confidence: 44, text: #Spor #siyaset Fanket FIliskiler
>
> Is there any way to combine the individual boxes and print them like the 
> API does? Thanks in advance.
>
> ### Environment
>
> * **Tesseract Version**: tesseract 4.1.1-rc2-25-g9707
>  leptonica-1.78.0
>   libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
>  Found AVX2
>  Found AVX
>  Found FMA
>  Found SSE
>  Found libarchive 3.3.3 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 liblz4/1.8.3 libzstd/1.3.8
>
> * **Platform**: Linux pardus 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
>
