Re: Inverse English and digits in Arabic Text

2020-09-08 Thread adeq8
Thank you for the support. I upload the PDF file page by page. In this case, left-to-right (LTR) or right-to-left (RTL) reading order applies to the whole document, not to a specific text block (separately for Arabic, separately for English). I can see the same behavior in the output via /select as well.

Re: Inverse English and digits in Arabic Text

2020-09-08 Thread Alexandre Rafalovitch
If you are uploading a PDF, then you must be doing it via Tika or via an extract handler (which uses Tika under the covers). Try getting a standalone Tika of the same version and see what it outputs. Perhaps there is something in those specific PDF pages that confuses Tika. Like, if it used
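A rough sketch of the standalone check suggested above, assuming a local tika-app jar and Java on the PATH. The jar name and PDF path are placeholders, not values from this thread. The helper prints the extracted text with non-ASCII characters escaped, so whatever order shows up is the stored order and not something a terminal rearranged.

```python
import subprocess


def tika_text_command(tika_jar, pdf_path):
    """Build the tika-app command line for plain-text extraction in UTF-8."""
    return ["java", "-jar", tika_jar, "--text", "--encoding=UTF-8", pdf_path]


def extract_text(tika_jar, pdf_path):
    """Run tika-app and return the extracted text (paths are placeholders)."""
    result = subprocess.run(
        tika_text_command(tika_jar, pdf_path),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")


def escaped(text):
    """Show non-ASCII characters as \\uXXXX escapes, in storage order."""
    return text.encode("unicode_escape").decode("ascii")
```

Calling `print(escaped(extract_text("tika-app.jar", "page1.pdf")))` with the same Tika version that Solr bundles would show whether the English words already come out reversed at extraction time.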

Re: Inverse English and digits in Arabic Text

2020-09-07 Thread Erick Erickson
A quick test would be to send some simple queries with curl rather than the browser; that’ll avoid any rendering issues. Second, take a look at the admin UI>>pick_a_collection_from_the_dropdown>>analysis page and look at the terms in the field in question. Do they look “ok”? You’re looking at
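The same idea as the curl test can be scripted, as a minimal sketch. The host, the collection name `mycollection`, and the field name `content` are assumptions for illustration only. The response is printed with non-ASCII characters escaped, so no browser or terminal bidi rendering is involved.

```python
import json
import urllib.parse
import urllib.request


def build_select_url(base_url, collection, query, rows=5):
    """Build a /select URL that asks Solr for a JSON response."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url}/solr/{collection}/select?{params}"


def fetch_docs_escaped(url):
    """Fetch matching documents and return them as ASCII-only JSON text."""
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    # ensure_ascii=True renders Arabic as \uXXXX, exposing the stored order.
    return json.dumps(payload["response"]["docs"], ensure_ascii=True, indent=2)
```

For example, `print(fetch_docs_escaped(build_select_url("http://localhost:8983", "mycollection", "content:apache")))` would list the stored field values exactly as Solr returns them.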

Re: Inverse English and digits in Arabic Text

2020-09-07 Thread Alexandre Rafalovitch
> Doc in Arabic with some English - English text is inverted (for example, "gro.echapa.www"), what makes search by key words impossible.
What very specifically do you mean by that? How do you see the inversion? If that's within some sort of web UI, then you are probably seeing the HTML bidi
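To tell a display effect from genuinely reversed data, one option is to look at the characters in storage order, without any renderer in between. The sketch below groups a string into runs by Unicode bidirectional class using Python's standard `unicodedata` module. The sample strings are made up for illustration.

```python
import itertools
import unicodedata


def bidi_runs(text):
    """Group characters, in storage order, by their Unicode bidi class.

    'L' is left-to-right (Latin), 'AL' is Arabic letter, 'EN' is a
    European digit, 'WS' is whitespace.
    """
    return [
        (bidi_class, "".join(chars))
        for bidi_class, chars in itertools.groupby(
            text, key=unicodedata.bidirectional
        )
    ]


# Hypothetical stored value: two Arabic letters, a space, then Latin text.
for bidi_class, run in bidi_runs("\u0645\u0648 www"):
    print(bidi_class, run.encode("unicode_escape").decode("ascii"))
```

If a Latin ('L') run reads backwards here, for instance starting with "gro", then the stored text itself is reversed and the problem is probably upstream of the browser. If it reads forwards, the inversion is more likely a rendering matter.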

Inverse English and digits in Arabic Text

2020-09-07 Thread adeq8
Hi, could you please help resolve an issue? I upload/index several documents in English and Arabic to Solr. In addition, I use a handler for the Arabic language: