Aha! Worked like a charm -- thanks very much! Combining HanT+Japanese seems 
also to degrade recognition accuracy pretty significantly, but HanT from 
the best trained data works pretty well on its own for pages that are just 
English and Chinese.
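
For anyone who finds this thread later, here is a minimal sketch of the setup that worked for me. It assumes HanT.traineddata from the best trained data has been copied into the tessdata directory, so that it can be selected with '-l HanT'. The file name page.png and both helper names are hypothetical, made up for illustration only.

```python
import shutil
import subprocess


def build_tesseract_cmd(image, lang="HanT", tessdata_dir=None):
    """Build the argument list for one tesseract run that prints text to stdout."""
    # 'stdout' as the output base makes tesseract print the recognized text.
    cmd = ["tesseract", str(image), "stdout", "-l", lang]
    if tessdata_dir is not None:
        # Only needed when the traineddata files are not in the default location.
        cmd += ["--tessdata-dir", str(tessdata_dir)]
    return cmd


def ocr_page(image, lang="HanT", tessdata_dir=None):
    """Run tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    result = subprocess.run(
        build_tesseract_cmd(image, lang, tessdata_dir),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout
```

Calling ocr_page("page.png") uses HanT on its own. Passing lang="eng+chi_tra" reproduces the combination from my original message, which makes it easy to compare the two outputs on the same page.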

On Thursday, August 31, 2017 at 4:25:06 AM UTC-4, shree wrote:
>
> Have you tried the best trained data for Chinese, which has English in 
> addition to Chinese as part of the training? That may be a better option 
> than using eng+
>
> On 31-Aug-2017 12:31 PM, "Brendan O'Kane" <[email protected]> wrote:
>
>> Hi all,
>>
>> Running 'tesseract -l eng+chi_tra' on a scanned page of English text 
>> mixed with Chinese characters does not detect any Chinese characters at 
>> all: 
>>
>> > The five chapters on fiction, memoirs, and other kinds of prose that
>> > follow offer as many approaches to our understanding of the transition
>> > between 1644 and 1700. Focusing on the lives of Mao Xiang § X (161-
>> > 93) and Yu Huai A1% (1616-96), Oki Yasushi develops portraits of these
>> > two "romantic Jiangnan loyalists," who clung to patterns of late Ming
>> > feeling and aestheticism long after the Ming had fallen. The image of
>> > loyalism as romantic is in striking contrast to starker images of loyalist
>> > experience. Both Mao and Yu are best known for their memoirs, which
>> > focus prominently on women, one of the new ways of figuring nos-
>> > talgia and resistance in male writings of the early Qing. Robert Hegel's
>> > "Dreaming the Past" is similarly concerned with the individual, fo-
>> > cusing on Chu Renhuo #ARE (ca. 1630-1705+), as well as his novel,
>> > Sui Tang yany: G B® #&, (ca. 1675), but it extends well beyond Chu and
>> > his work in contemplating how "the past" (the Tang past in particular)
>> > shaped imaginative literature in an era when the present offered little
>> > solace.
>>
>>
>> The characters are (mostly) correctly recognized when only 'chi_tra' is 
>> set as the OCR language, but at the cost of seriously degraded accuracy in 
>> English OCR:
>>
>> > The fve chapters on fiction,menoirs, and other kinds of prose thar
>> > follow offer as nany approaches to our understanding ofthe transition
>> > between :644 and I7oo. Focusing on the |ives of Mao 文 iang 冒 裱 (I6II-
>> > 93andYuTiuai 余 懷 ((616-96), OkiYasushidevelops portraits ofthese
>> > two "ronantic Jiangnan loyalists"who clung to patterns of ]ate N{ing
>> > feeling and aestheticismn long after the Ming had fallen. The of
>> > loyalisn as ronantic is in striking contrast to starker 1nages of |oyalisr
>> > experience. Both Mao and Yu are best known fortheir memolrs, wˇhich
>> > focus Proninently on womnen, one of the new ways of figuring nox-
>> > talgia and resistance in male writings of the early Cuing. Roberr Tiegel's
>> > "1reaning the Past" is simnilarly concerned with the individual, fo-
>> > cusing on ChuRenhuo 褚 人 穫 (ca. I63o-I7oy+)}, as well a$ his novel,
>> > 5#77mzg5227 隋 唐 演 義 (Ca.I67y, butit extends well beyondChu and
>> > his work in contemplating how "the past" (the Tang Past in Particulan
>> > shaped imaginative ]iterature in an era when Lhe present offered |ittle
>> > $olace.
>>
>>
>> Is this a known issue? Am I doing something wrong here? 
>>
>> --Brendan
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/0ed8e7da-72cb-4bb8-8f48-44f8fc76f7c2%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/tesseract-ocr/0ed8e7da-72cb-4bb8-8f48-44f8fc76f7c2%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
