-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 29.06.2012 08:23, Adam Reichold wrote:
> On 29.06.2012 01:49, Albert Astals Cid wrote:
>> On Thursday, 28 June 2012, at 17:53:45, Adam Reichold wrote:
>> Hello,
> 
>> If I remember correctly, some time ago someone proposed caching 
>> the TextOutputDev/TextPage used in Poppler::Page::search to 
>> improve performance. Instead, I would propose to add another 
>> search method to Poppler::Page which searches the whole page at 
>> once and returns a list of all occurrences.
> 
>> Applications using the qt4 frontend and this method could then 
>> decide whether to cache this information or not. The 
>> implementation of the current search method would not change.
> 
>> The appended patch does this. But the two search methods share 
>> some duplicate code. I am not sure what the best way to avoid 
>> this is.
> 
>>> First concern: QRectF uses float (instead of double) on some 
>>> architectures, such as ARM, so you actually lose precision 
>>> (that's why the double variant of search() exists). I'm not 
>>> sure we should worry about that, but we probably should. 
>>> Imagine you get the list of matches from the search() that 
>>> returns the list and then pass one of them to the ::search() 
>>> that accepts a QRectF (though that doesn't actually make much
>>> sense) to get the "next" item. The float->double conversion
>>> will then go wrong, and you might always end up at the same
>>> item because of the truncation.
> 
>>> On the other hand using a list of QRectF is much more 
>>> convenient and probably has enough precision for painting, so 
>>> maybe we can just document that you should not use the results 
>>> of the ::search() that returns a list as input for the other 
>>> ::search() variants?
> 
>>> Opinions?
> 
> 
> Personally, I think that it would be nicer doing it that way. 
> Especially since you will still get a deprecation warning if you 
> call ::search with a QRectF as an argument.
> 
> Regards, Adam.
> 

Updated the patch to include the warning.
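
For illustration, the truncation that the warning refers to can be shown without Qt. This is only a minimal sketch, assuming qreal is float as on ARM; the helper names are made up for this example and are not part of Poppler or Qt:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: mimics storing a double coordinate in a QRectF
// whose qreal is float, and then widening it back to double, as would
// happen when a result is fed into a double-based search() call.
double roundTripThroughFloat(double value)
{
    assert(std::isfinite(value));
    const float narrowed = static_cast<float>(value); // precision is lost here
    return static_cast<double>(narrowed);             // widening cannot restore it
}

// Returns true if the coordinate comes back unchanged from the round trip.
bool survivesRoundTrip(double value)
{
    return roundTripThroughFloat(value) == value;
}
```

A value such as 0.5 is exactly representable as a float and survives the round trip, while 0.1 does not, so a coordinate returned in a QRectF may no longer match the double that the search code produced internally.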

> 
>>> Albert
> 
> 
>> Testing this with some sample files shows large improvements 
>> (above 100% as measured by runtime) for searching the whole 
>> document and especially for short phrases that occur often.
> 
>> Thanks for any comments and advice. Best regards, Adam.
>> _______________________________________________
>> poppler mailing list
>> [email protected]
>> http://lists.freedesktop.org/mailman/listinfo/poppler
> 
> 
> 
> 

- -----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJP7U0bAAoJEPSSjE3STU34UtIIALWeAsaSQ8KsDhaiJrpV13xB
O7Lj682VH3abzv7RkrMR07NBMu/cHOFSAfaS5LxLxDD/Wervoqfm1UZ2LwsCSwO3
TodC0TU5pf5Im1xoyz92rNfkBqPRVvlFLw10BSVVWBaIvhLKOQgHj7s1U6LdqoOa
PnT65/Gu1yJZO3yvSJN22t5ST5gtSYikkpjqaQ3Ts+D2XJQXJuRllZGaniR+fjel
FS+V5Gj1Zi3Fjg8aLESGdTn0KosV3R8aUaacGKXhO6klxwZMI1vv+nyGsFdNTMGL
u6c5zjD0QtKGEYx/XNhKQAf1BqnpH9qbwnSHm9xgFzVV/TC7z3DbPYT1ahttGpY=
=VHDH
- -----END PGP SIGNATURE-----

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJP7U1LAAoJEPSSjE3STU34YNAH/iA6FZJ6Y25h7Y08Zjxbcb3t
ca16sbW8LFJ4A/ZNYP9D49wHIfhxADd2+HZ4g+jkKtNRgxv1PoqJKXH6BwQKH5lx
RAccyeykkZVVZlXBY0vRJ91N/xwxjgmyIodSqx0xqYpbnwm7AL2hDhF1pKfx6dr2
pHCqzJlkJyz4PVMHsUG8qrxz1Bot2/Vqr/CNdKpveqTLegBzFVCRE2MKhzkVC3Qm
x8bm8mNRKVoFz5aaq8ZJhaWzXN9INdKLhf1y7ZO0lapr+IMp9v3+Vbxmkz0Iznr6
QWAJ2stQ256EEbpJZoDjDnXjw8FISTIUk/f7Fh0P4E/bmzK/4sPdatUVlGcOXAQ=
=Idbg
-----END PGP SIGNATURE-----
From 46cc6f78fe17df89751e31f1fed8cf89dca64858 Mon Sep 17 00:00:00 2001
From: Adam Reichold <[email protected]>
Date: Thu, 28 Jun 2012 17:42:17 +0200
Subject: [PATCH 1/2] add whole-page search method to Poppler::Page

---
 qt4/src/poppler-page.cc | 38 ++++++++++++++++++++++++++++++++++++++
 qt4/src/poppler-qt4.h   |  9 +++++++++
 2 files changed, 47 insertions(+)

diff --git a/qt4/src/poppler-page.cc b/qt4/src/poppler-page.cc
index 6a16d03..6a794e3 100644
--- a/qt4/src/poppler-page.cc
+++ b/qt4/src/poppler-page.cc
@@ -427,6 +427,44 @@ bool Page::search(const QString &text, QRectF &rect, SearchDirection direction,
   return found;
 }
 
+QList<QRectF> Page::search(const QString &text, SearchMode caseSensitive, Rotation rotate) const
+{
+  const QChar * str = text.unicode();
+  int len = text.length();
+  QVector<Unicode> u(len);
+  for (int i = 0; i < len; ++i) u[i] = str[i].unicode();
+
+  GBool sCase;
+  if (caseSensitive == CaseSensitive) sCase = gTrue;
+  else sCase = gFalse;
+
+  int rotation = (int)rotate * 90;
+  
+  QList<QRectF> results;
+  double sLeft = 0.0, sTop = 0.0, sRight = 0.0, sBottom = 0.0;
+  
+  TextOutputDev td(NULL, gTrue, 0, gFalse, gFalse);
+  m_page->parentDoc->doc->displayPage( &td, m_page->index + 1, 72, 72, rotation, false, true, false );
+  TextPage *textPage=td.takeText();
+  
+  while(textPage->findText( u.data(), len, 
+        gFalse, gTrue, gTrue, gFalse, sCase, gFalse, gFalse, &sLeft, &sTop, &sRight, &sBottom ))
+  {
+      QRectF result;
+      
+      result.setLeft(sLeft);
+      result.setTop(sTop);
+      result.setRight(sRight);
+      result.setBottom(sBottom);
+      
+      results.append(result);
+  }
+  
+  textPage->decRefCnt();
+
+  return results;
+}
+
 QList<TextBox*> Page::textList(Rotation rotate) const
 {
   TextOutputDev *output_dev;
diff --git a/qt4/src/poppler-qt4.h b/qt4/src/poppler-qt4.h
index 827ea53..a363f78 100644
--- a/qt4/src/poppler-qt4.h
+++ b/qt4/src/poppler-qt4.h
@@ -602,6 +602,15 @@ delete it;
 	   \since 0.14
 	**/
 	bool search(const QString &text, double &rectLeft, double &rectTop, double &rectRight, double &rectBottom, SearchDirection direction, SearchMode caseSensitive, Rotation rotate = Rotate0) const;
+	
+	/**
+	   Returns a list of all occurences of the specified text on the page.
+	   
+	   \param text the text to search
+	   \param caseSensitive whether to be case sensitive
+	   \param rotate the rotation to apply for the search order
+	**/
+	QList<QRectF> search(const QString &text, SearchMode caseSensitive, Rotation rotate = Rotate0) const;
 
 	/**
 	   Returns a list of text of the page
-- 
1.7.11.1


From 3555861194c00fc6ae84891a73f6f0e51921b0f7 Mon Sep 17 00:00:00 2001
From: Adam Reichold <[email protected]>
Date: Fri, 29 Jun 2012 08:35:03 +0200
Subject: [PATCH 2/2] fixed a typo and added a warning in the comment

---
 qt4/src/poppler-qt4.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/qt4/src/poppler-qt4.h b/qt4/src/poppler-qt4.h
index a363f78..7caf4f4 100644
--- a/qt4/src/poppler-qt4.h
+++ b/qt4/src/poppler-qt4.h
@@ -604,11 +604,13 @@ delete it;
 	bool search(const QString &text, double &rectLeft, double &rectTop, double &rectRight, double &rectBottom, SearchDirection direction, SearchMode caseSensitive, Rotation rotate = Rotate0) const;
 	
 	/**
-	   Returns a list of all occurences of the specified text on the page.
+	   Returns a list of all occurrences of the specified text on the page.
 	   
 	   \param text the text to search
 	   \param caseSensitive whether to be case sensitive
 	   \param rotate the rotation to apply for the search order
+	   
+	   \warning Do not use the returned QRectF as arguments of another search call because of truncation issues if qreal is defined as float.
 	**/
 	QList<QRectF> search(const QString &text, SearchMode caseSensitive, Rotation rotate = Rotate0) const;
 
-- 
1.7.11.1
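
The thread leaves caching to the application. A rough sketch of what that could look like follows, using std::string offsets in place of Poppler::Page and QRectF so that it compiles without Qt. All names here (findAll, PageSearchCache) are hypothetical and not part of Poppler; findAll merely stands in for the list-returning search().

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the list-returning search(): collects the start
// offset of every non-overlapping occurrence of needle in one pass.
std::vector<std::size_t> findAll(const std::string &haystack, const std::string &needle)
{
    std::vector<std::size_t> hits;
    if (needle.empty()) return hits;
    std::size_t pos = haystack.find(needle);
    while (pos != std::string::npos) {
        hits.push_back(pos);
        pos = haystack.find(needle, pos + needle.size());
    }
    return hits;
}

// Hypothetical application-side cache keyed by (page index, search text).
class PageSearchCache
{
public:
    explicit PageSearchCache(std::vector<std::string> pages) : m_pages(std::move(pages)) {}

    // Scans the page only on the first request for a given (page, text) pair.
    const std::vector<std::size_t> &search(std::size_t page, const std::string &text)
    {
        assert(page < m_pages.size());
        const auto key = std::make_pair(page, text);
        auto it = m_cache.find(key);
        if (it == m_cache.end()) {
            ++m_scans;
            it = m_cache.emplace(key, findAll(m_pages[page], text)).first;
        }
        return it->second;
    }

    // Number of full page scans performed so far.
    int scans() const { return m_scans; }

private:
    std::vector<std::string> m_pages;
    std::map<std::pair<std::size_t, std::string>, std::vector<std::size_t>> m_cache;
    int m_scans = 0;
};
```

Repeating the same query on the same page is then answered from the cache without another scan. Whether this pays off in a real viewer depends on memory use and on how often pages are searched again, which is why the patch leaves the decision to the application.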


