D10455: Add RTL support for search, copy & paste in pdf

2022-03-25 Thread Guy Sheffer
guysoft added a comment.


  Hey,
  So I sent a merge request for this, but the issue is now that clazy 
formatting in the ci/cd of the build requires these too lines be changed to 
something more readable:
  
// output a left-to-right section
for (j = i; j < origTxtOrder.length() && ! 
origTxtOrder.at(j)->text().isRightToLeft(); ++j) ;
  
  and
  
for(j = i; j < origTxtOrder.length() && 
(origTxtOrder.at(j)->text().isRightToLeft() || origTxtOrder.at(j)->text() == 
QLatin1String(" ")); ++j);
  
  I can't test because when I run this on my ubuntu I get that backends are 
missing. So I can't really change this stuff.
  
  Can anyone help me out to turn this in to a code block?
  @fahadalsaidi perhaps? Its pretty close to get merged.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: sander, guysoft, davidhurka, yaron, okular-devel, chfanzil, ngraham, 
anaumann, johnzh, andisa, siddharthmanthan, maguirre, fbampaloukas, joaonetto, 
kezik, darcyshen, aacid


D10455: Add RTL support for search, copy & paste in pdf

2022-03-22 Thread Guy Sheffer
guysoft added a comment.


  Done, but I can't test it:
  https://invent.kde.org/graphics/okular/-/merge_requests/595

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: sander, guysoft, davidhurka, yaron, okular-devel, chfanzil, ngraham, 
anaumann, johnzh, andisa, siddharthmanthan, maguirre, fbampaloukas, joaonetto, 
kezik, darcyshen, aacid


D10455: Add RTL support for search, copy & paste in pdf

2022-03-17 Thread Oliver Sander
sander added a comment.


  Step 1 would be to open a merge request at 
https://invent.kde.org/graphics/okular/-/merge_requests .  Can you do that?

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: sander, guysoft, davidhurka, yaron, okular-devel, chfanzil, ngraham, 
anaumann, johnzh, andisa, siddharthmanthan, maguirre, fbampaloukas, joaonetto, 
kezik, darcyshen, aacid


D10455: Add RTL support for search, copy & paste in pdf

2022-03-17 Thread Guy Sheffer
guysoft added a comment.


  Hey, Any progress on this?
  What needs to change in core/textpage.cpp in order for this to work correctly?

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: guysoft, davidhurka, yaron, okular-devel, chfanzil, ngraham, anaumann, 
johnzh, andisa, siddharthmanthan, maguirre, fbampaloukas, joaonetto, kezik, 
darcyshen, aacid


D10455: Add RTL support for search, copy & paste in pdf

2020-12-28 Thread David Hurka
davidhurka added a comment.


  I read the reported test results as if this doesn’t work correctly. However, 
without this patch it doesn’t work at all, so that may already be an argument 
to accept this patch.
  
  Text handling and search is IMHO broken in general, and needs to be redone. 
The current approach reorders all TextEntities, to give a nice 
left-to-right-top-to-bottom order. That is nice in general, but I think it 
should be done only on paragraph level, not on character level.
  
  What needs to happen next for this patch, is probably to rebase it on master, 
and submit it on invent.kde.org. Then we can see together how to fix the 
mentioned memory leaks.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: davidhurka, yaron, okular-devel, chfanzil, ngraham, johnzh, andisa, 
siddharthmanthan, maguirre, fbampaloukas, joaonetto, kezik, darcyshen, aacid


D10455: Add RTL support for search, copy & paste in pdf

2020-12-27 Thread Yaron Shahrabani
yaron added a comment.


  I'm still experiencing this bug.
  
  For example:
  If I want to find the word Shalom I need to search for molahS in order to 
find a match.
  
  Anything I can help with regarding this patch?

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: yaron, okular-devel, chfanzil, ngraham, johnzh, andisa, siddharthmanthan, 
maguirre, fbampaloukas, joaonetto, kezik, darcyshen, aacid


D10455: Add RTL support for search, copy & paste in pdf

2020-02-21 Thread Albert Astals Cid
aacid requested changes to this revision.
aacid added a comment.
This revision now requires changes to proceed.


  Please move as a Merge Request in https://invent.kde.org/kde/okular
  
  We have pre-commit CI and lots of checks including clazy and clang-tidy there 
so it's a much better place for doing the review/approval/merge of the code.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: okular-devel, chfanzil, ngraham, johnzh, andisa, siddharthmanthan, 
maguirre, fbampaloukas, joaonetto, kezik, tfella, darcyshen, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-05-16 Thread Fahad Al-Saidi
fahadalsaidi added a comment.


  In D10455#263014 , @ngraham wrote:
  
  > @fahadalsaidi, are you able to work on those?
  
  
  Sorry, I don't have  knowledge & time to fix the memory leaks. For the other 
question , I've already answered, here is it:
  
  >   The other back-ends have many problem not related to this patch as 
following:
  > 
  > - xps: okular crash when open arabic xps file and if it can open it, it 
renders it in wrong way.
  > - dejuv: searching & copying Arabic text are fine since dejuv back-end has 
it own search function.
  > - odt: there is broken in LTR or RTL. There is no text to search at all.
  > - chm: Although okular can open it but searching is bad for LTR & RTL text, 
it is basically broken.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: okular-devel, chfanzil, ngraham, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-05-15 Thread Nathaniel Graham
ngraham added a comment.


  @fahadalsaidi, are you able to work on those?

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: okular-devel, chfanzil, ngraham, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-05-14 Thread Albert Astals Cid
aacid added a comment.


  In D10455#262444 , @ngraham wrote:
  
  > What's our path forward here?
  
  
  Someone should address the memory leaks and someone should address the 
questions about the impact of this to other formats other than PDF.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: okular-devel, chfanzil, ngraham, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-05-14 Thread Nathaniel Graham
ngraham added a comment.
Restricted Application added a subscriber: okular-devel.


  What's our path forward here?

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: okular-devel, chfanzil, ngraham, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-03-17 Thread Fahad Al-Saidi
fahadalsaidi added a comment.


  @chfanzil , thanks for testing and sharing your findings. The patch doesn't 
touch search logic , it deals only with inner text layer in okular.
  
  @aacid  sorry to get to you late but Unfortunately I couldn't figure out how 
to fix the leak, my knowledge in C++ & programming is limited. I am just trying 
here. In other hand why not make ouklar deal with poppler directly to search & 
get text, the same way it does with dejuv backend?  I know it is a lot of work 
but Here I am just asking.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: chfanzil, ngraham, michaelweghorn, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-02-25 Thread Chfan Zil
chfanzil added a comment.


  I've tested this patch using 2 PDF files: One in Hebrew (it was downloaded 
using Wikipedia's Download-as-PDF option) the other in Arabic (it was supplied 
with the patch).
  F5729640: Open Source (Hebrew Wikipedia).pdf 

  
  F5729641: arabic-search-test.pdf 
  
  The results are as follows:
  
  1. Okular was able to find the text I was searching for (Success).
  
  2. But it is looking for the text in random areas. This is caused, probably, 
because Okular perceives each bulk of few words as located in different areas. 
And not as it should:
  
  First line: first word to the right, second word to the right, third...
  Second line: first word to the right, second word to the right, third...
  Third line...
  ..
  ..
  Last line: first word to the right, second word to the right, third..., last 
word to the right.
  
  I'm attaching a gif to illustrate this (I'm looking for the first word in the 
text, a 3-letter word 'קוד' (which translates to code):
  F5729642: Search RTL text in patched okular D10455 - 1.gif 

  
  3. Okular is unable to copy RTL text correctly. The order of the words gets 
mixed. This can be seen when trying to select the text - the way in which the 
text gets selected isn't intuitive, and can also be seen when pasting the 
selected text - in comparison to the original text. Here is a gif to illustrate 
this:
  
  F5729644: Copy RTL text in patched okular D10455.gif 

  
  4. The same problems occur when testing with the PDF in Arabic. Selecting the 
text is messy and searching for a letter in the text jumps between lines up and 
down.
  
  Here is a gif to illustrate this (I'm looking for a common letter from the 
Arabic alphabet, the letter 'ا' (Alif, the first letter in the Arabic 
alphabet)):
  F5729646: Search RTL text in patched okular D10455 - 2.gif 

  
  To conclude:
  In regards to searching:
  **unpatched okular**: You need to type the text in reverse in order to search 
it.
  **okular + D10455 **: You can partially 
search by typing the text in the correct way, but the results will be scattered 
and not in the intuitive order of reading.
  **okualr + D10298 **: You can partially 
search by typing the text in the correct way, but the results will come up for 
each line, in the opposite order, but at least it will jump from the first 
line, to the second, to the third, and not sporadically like D10455 
.
  
  In regards to copying text:
  **unpatched okular**: Will copy the text in reverse (not just the order of 
the words, but also the order of the letters in a word), so ABC DEFG HIJK will 
paste as KJIH GFED CBA.
  **okular + D10455 **: Will copy parts of 
the text in the correct way. 
  So for example: ABC DEFG HIJK LMNO PQR STUVW XYZ 
  Will be pasted as: HIJK LMNO ABC PQR STUVW DEFG XYZ
  **okualr + D10298 **: Like unpatched 
okular
  
  In terms of usability, the ability to search is important, so D10298 
 is better than D10455 
 (as it enables one to search in each line 
from the end of the line to the beginning of the line, and not sporadically.
  In regards to copying text, they're both not so good.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: chfanzil, ngraham, michaelweghorn, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-02-23 Thread Nathaniel Graham
ngraham added a comment.


  @fahadalsaidi, I recently migrated an older patch of yours from Reviewboard 
to Phabricator (D10298 ), but I just closed 
it in favor of this one. If D10298  is 
still relevant even with this patch, please commandeer and re-open it. Thanks!

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: ngraham, michaelweghorn, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-02-19 Thread Albert Astals Cid
aacid added a comment.


  In D10455#209288 , @fahadalsaidi 
wrote:
  
  > @aacid Thanks for the feedback.
  >
  > > And you have a memory leak in your function as reported by
  >
  > valgrind shows that the leak happen in:
  >
  >   makeWordFromCharacters(QList const&, int, int) 
(textpage.cpp:1209)
  >
  >
  > which needs to be fix but not in this patch since it is irrelevant to it.
  
  
  No, the new code is causing that leak.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: ngraham, michaelweghorn, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-02-18 Thread Fahad Al-Saidi
fahadalsaidi added a comment.


  @aacid Thanks for the feedback.
  
  > I'm still unconvinced the text reversion is happening at the proper level.
  
  I really a new to okular's code, so please guide me to the right direction.
  
  > And you have a memory leak in your function as reported by
  
  valgrind shows that the leak happen in:
  
makeWordFromCharacters(QList const&, int, int) 
(textpage.cpp:1209)
  
  which needs to be fix but not in this patch since it is irrelevant to it.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: ngraham, michaelweghorn, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-02-18 Thread Fahad Al-Saidi
fahadalsaidi updated this revision to Diff 27517.
fahadalsaidi added a comment.


  - use QTRY_COMPARE

REPOSITORY
  R223 Okular

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D10455?vs=27043=27517

BRANCH
  master

REVISION DETAIL
  https://phabricator.kde.org/D10455

AFFECTED FILES
  autotests/data/arabic-search-test.pdf
  autotests/searchtest.cpp
  core/textpage.cpp

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: ngraham, michaelweghorn, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-02-18 Thread Albert Astals Cid
aacid added a comment.


  In D10455#208549 , @fahadalsaidi 
wrote:
  
  > @aacid  any feedback?
  
  
  I'm still unconvinced the text reversion is happening at the proper level.
  
  Anyway meanwhile please change the test so that uses qtry_compare instead of 
the qtimer loop (i just fixed that for the other test)
  
  And you have a memory leak in your function as reported by 
  valgrind --leak-check=full ./autotests/searchtest
  please fix.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: ngraham, michaelweghorn, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-02-17 Thread Fahad Al-Saidi
fahadalsaidi added a comment.


  @aacid  any feedback?

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: ngraham, michaelweghorn, simgunz, aacid


D10455: Add RTL support for search, copy & paste in pdf

2018-02-13 Thread Fahad Al-Saidi
fahadalsaidi retitled this revision from "add rtl support to textpage" to "Add 
RTL support for search, copy & paste in pdf".
fahadalsaidi edited the summary of this revision.

REPOSITORY
  R223 Okular

REVISION DETAIL
  https://phabricator.kde.org/D10455

To: fahadalsaidi, #okular, aacid, ltoscano
Cc: ngraham, michaelweghorn, aacid