Do you have the ICU4J jar on your classpath in both situations?

On Mon, Oct 31, 2011 at 1:35 PM, ahmad ajiloo <[email protected]> wrote:
> Hello
> When I use Tika for extracting my persian pdf files, all the characters will
> be extracted vice versa. I mean that the characters showed from beginning of
> the line to the end, but from left to right. However when I use Tika gui via
> Nutch there is no mistake and the output text is right-to-left !!
>
> Following text is the first line of attached file in first mode (running
> Tika independently):
> ﻲﻠﻋ ﺎﻳ ﻮﺗ ﻝﻼﺟ ﺯﺍ ﻢﻧﺯ ﻡﺩ ﻪﻜﻧﺁ ﺕﺭﺪﻗ ﺖﺳﺍﺮﻣ ﻪﻧ ﻲﻣﺮﻜﻣ ﺩﻮﺟ ﺩﻮﺟﻭ ﻪﺑ ﺖﻤﻳﻮﮔ ﻪﻛ ﺖﺳﺍ
> ﺲﺑ ﻦﻴﻤﻫ ﻪﻧ ﻱﺪﺑﻮﻣ ﺖﺨﺗ ﻪﺑ ﻱﺍ ﻩﺩﺯ ﺖﻨﻄﻠﺳ ﻪﻴﻜﺗ ﻪﻜﻧﺁ ﻲﺋﻮﺗ
>
> and this is in second mode (running Tika gui via Nutch) and this is a clear
> persian text:
> نه مراست قدرت آنكه دم زنم از جلال تو يا علي نه همين بس است كه گويمت به
> وجود جود مكرمي توئي آنكه تكيه سلطنت زده اي به تخت موبدي
>
> Thanks for your attention

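A quick way to answer the ICU4J question for each setup is to ask the JVM whether an ICU4J class can be loaded. The sketch below is only an illustration: the class name ClasspathCheck and the method isOnClasspath are made up for this example, and com.ibm.icu.text.Bidi is used as a representative class from the ICU4J jar. Run it once with the classpath of the standalone Tika run and once with the classpath Nutch uses.

```java
final class ClasspathCheck {

    // Returns true if the named class is visible to this class's loader.
    // The class is located but not initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: Bidi is taken here as a representative ICU4J class.
        String icuClass = "com.ibm.icu.text.Bidi";
        System.out.println("ICU4J visible: " + isOnClasspath(icuClass));
    }
}
```

If the result differs between the two runs, the missing jar would be a plausible (though not proven) explanation for the reversed text in the standalone case.
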
-- lucidimagination.com
