Your message dated Sun, 2 Feb 2025 20:21:34 -0500
with message-id <CAD+GYvyeuh5P4=t7b_co2vxmon79jt5nn++y9snob_gap4z...@mail.gmail.com>
and subject line Re: regression: get_text doesn't work as it should and did. patch included
has caused the Debian Bug report #644243,
regarding regression: get_text doesn't work as it should and did. patch included
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)

-- 
644243: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=644243
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: libpoppler-glib6
Version: 0.16.7-2+b1
Severity: important

Dear Maintainer,

thanks a lot for providing poppler.

my problem: the poppler_page_get_text function began to return only part of
the full text of a page, while pdftotext kept returning the whole text,
correctly.

after some research, with the help of albert from upstream, i found that the
problem was introduced by brian in a patch related to right-to-left
selection, between 0.13.2 and 0.13.3.

albert and i tried to contact brian, with no success yet. but since the
offending patch was about selected text, i've modified the
poppler_page_get_text function to get the whole text directly, instead of
invoking poppler_page_get_selected_text, which naturally fell victim to
this bug.

i hope albert, maybe together with brian, will consider applying this patch
on 0.18.x, but they refused to apply it on the older 0.16.x that we have in
debian. hence i'd ask you to apply it for debian users.

diff --git a/glib/poppler-page.cc b/glib/poppler-page.cc
index 9850d44..63f9955 100644
--- a/glib/poppler-page.cc
+++ b/glib/poppler-page.cc
@@ -843,13 +843,21 @@ poppler_page_get_selected_text (PopplerPage *page,
 char *
 poppler_page_get_text (PopplerPage *page)
 {
-  PopplerRectangle rectangle = {0, 0, 0, 0};
+  GooString *sel_text;
+  double width, height;
+  char *result;
+  TextPage *text;
 
   g_return_val_if_fail (POPPLER_IS_PAGE (page), NULL);
 
-  poppler_page_get_size (page, &rectangle.x2, &rectangle.y2);
+  poppler_page_get_size (page, &width, &height);
+
+  text = poppler_page_get_text_page (page);
+  sel_text = text->getText (0, 0, width, height);
+  result = g_strdup (sel_text->getCString ());
+  delete sel_text;
 
-  return poppler_page_get_selected_text (page, POPPLER_SELECTION_GLYPH, &rectangle);
+  return result;
 }

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (990, 'testing'), (300, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 3.0.0-1-686-pae (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages libpoppler-glib6 depends on:
ii  libc6               2.13-21
ii  libcairo2           1.10.2-6.1
ii  libfreetype6        2.4.6-2
ii  libgcc1             1:4.6.1-4
ii  libgdk-pixbuf2.0-0  2.24.0-1
ii  libglib2.0-0        2.28.6-1
ii  libpoppler13        0.16.7-2
ii  libstdc++6          4.6.1-4

libpoppler-glib6 recommends no packages.

libpoppler-glib6 suggests no packages.

-- no debconf information
--- End Message ---
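
The ownership handling in the patch above (fetch a newly allocated string from the text page, duplicate its contents for the caller, then delete the temporary) can be sketched without poppler itself. This is only an illustration under stated assumptions: FakeGooString, FakeTextPage and page_get_text are hypothetical stand-ins, not poppler types or functions, and std::malloc/std::strcpy stand in for g_strdup.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for poppler's GooString (not the real class).
class FakeGooString {
public:
    explicit FakeGooString(const std::string &s) : data_(s) {}
    const char *getCString() const { return data_.c_str(); }
private:
    std::string data_;
};

// Hypothetical stand-in for TextPage. As the patch assumes for the real
// TextPage::getText, the returned object is owned by the caller.
class FakeTextPage {
public:
    explicit FakeTextPage(const std::string &content) : content_(content) {}
    // The rectangle arguments are ignored here; the stand-in always
    // returns the whole page content.
    FakeGooString *getText(double, double, double, double) const {
        return new FakeGooString(content_);
    }
private:
    std::string content_;
};

// Mirrors the shape of the patched poppler_page_get_text: ask the text
// page for everything inside the page rectangle, copy it into a buffer
// the caller owns, and free the temporary string object.
char *page_get_text(const FakeTextPage &text, double width, double height) {
    FakeGooString *sel_text = text.getText(0, 0, width, height);
    assert(sel_text != nullptr);

    const char *src = sel_text->getCString();
    // Stand-in for g_strdup: allocate and copy, including the terminator.
    char *result = static_cast<char *>(std::malloc(std::strlen(src) + 1));
    if (result != nullptr) {
        std::strcpy(result, src);
    }

    delete sel_text;  // the copy in `result` outlives the temporary
    return result;
}
```

In this sketch the caller releases the returned buffer with std::free; with the real glib API the result of poppler_page_get_text would be released with g_free.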
--- Begin Message ---
poppler has had many soname bumps since this issue was reported (in fact
they have been doing monthly soname bumps for a long time).

Anyway, if you still think this is a problem, please report this issue
to the poppler developers:
https://gitlab.freedesktop.org/poppler/poppler/-/issues

I'm closing this bug.

Thank you,
Jeremy Bícha
--- End Message ---

