Your message dated Sat, 03 Mar 2018 10:05:45 +0000
with message-id <e1es42z-000cgf...@fasolo.debian.org>
and subject line Bug#891699: Removed package(s) from unstable
has caused the Debian Bug report #423586,
regarding referencer: PDF-scraping for DOIs sometimes cuts them off in the
middle
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)
--
423586: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=423586
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: referencer
Version: 1.0.2-1
Severity: normal
I have a number of PDFs whose DOIs appear in the text but that Referencer
cannot scrape out properly. The PDFs carry no real metadata, so Referencer
falls back to extracting text from the page body. The complete BT/ET block
containing the DOI is at the end of this message, but the key part is this:
[(doi:10.1016/)14.5(S)-95.3(0)]TJ
6.3307 0 TD
0.0983 Tc
[(010-0277\(02\)00)-6.3(235-4)]TJ
ET
This causes libpoppler to feed this text to BibData::guessDoi():
doi:10.1016/S 0 0 1 0 - 0 2 7 7 ( 0 2 ) 0 0 2 3 5 - 4\n
"10.1016/S" is what Referencer records as the DOI. The correct DOI is the above
string with the "doi:" prefix and all the spaces removed, i.e. 10.1016/S0010-0277(02)00235-4.
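To see where the glyphs come from, the two TJ arrays quoted above can be
decoded by hand. The following is a minimal Python sketch, not poppler's or
Referencer's code; parse_tj_array and joined_text are hypothetical helper
names, and only simple backslash escapes such as \( and \) are handled. In a
TJ array the numbers are positioning adjustments in thousandths of a
text-space unit, and a negative number widens the gap before the next glyph,
so the -95.3 after (S) is a plausible source of the first spurious space. The
later spaces presumably stem from the 0.0983 Tc character spacing, but that
is an assumption about how the extractor decides where words end.

```python
import re

# Matches either a parenthesised PDF string (with backslash escapes) or a number.
_TJ_TOKEN = re.compile(r"\((?P<s>(?:\\.|[^\\()])*)\)|(?P<n>-?\d+(?:\.\d+)?)")


def parse_tj_array(operand):
    """Split the operand of a TJ operator into ("text", str) / ("adjust", float) pairs."""
    items = []
    for m in _TJ_TOKEN.finditer(operand):
        if m.group("s") is not None:
            # Undo single-character backslash escapes such as \( and \).
            items.append(("text", re.sub(r"\\(.)", r"\1", m.group("s"))))
        else:
            items.append(("adjust", float(m.group("n"))))
    return items


def joined_text(operand):
    """Concatenate only the string fragments, ignoring positioning adjustments."""
    return "".join(value for kind, value in parse_tj_array(operand) if kind == "text")


first = r"[(doi:10.1016/)14.5(S)-95.3(0)]"
second = r"[(010-0277\(02\)00)-6.3(235-4)]"
print(joined_text(first) + joined_text(second))
```

Concatenating the string fragments alone gives the intended text with no
spaces at all, which suggests the spaces are introduced by the extractor's
gap handling rather than being present in the PDF strings.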
Unfortunately, I don't have any concrete suggestion for how guessDoi() could
do a better job in this case without also screwing up other situations (where
random text appears immediately after the DOI, separated only by a space).
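For discussion only, one heuristic that might be tried is to glue a run of
single-character tokens back onto the DOI, while leaving ordinary words
alone. The sketch below is in Python rather than Referencer's C++, and
guess_doi, min_run and the prefix pattern are all assumptions made for
illustration; it is not how BibData::guessDoi() works today. It can still be
fooled, for example by a DOI followed by several one-letter words.

```python
import re

# A DOI prefix: "10.", a registrant code of four or more digits, then "/".
_DOI_START = re.compile(r"10\.\d{4,}/")


def guess_doi(text, min_run=3):
    """Return a DOI candidate found in extracted text, or None.

    Hypothetical heuristic: if the token after the prefix is followed, on the
    same line, by at least `min_run` single-character tokens, treat that run
    as letter-spaced glyphs and append it to the DOI.
    """
    m = _DOI_START.search(text)
    if m is None:
        return None
    # Only consider the remainder of the line on which the prefix appears.
    rest = text[m.end():].split("\n", 1)[0]
    tokens = rest.split()
    if not tokens:
        return None
    suffix = tokens[0]
    run = []
    for tok in tokens[1:]:
        if len(tok) != 1:
            break  # a normal word ends the letter-spaced run
        run.append(tok)
    if len(run) >= min_run:
        suffix += "".join(run)
    return m.group(0) + suffix


spaced = "doi:10.1016/S 0 0 1 0 - 0 2 7 7 ( 0 2 ) 0 0 2 3 5 - 4\n"
print(guess_doi(spaced))
```

The min_run threshold is the knob that trades off the two failure modes
described above: a lower value rescues more letter-spaced DOIs but is more
likely to swallow stray one-letter words that follow a well-formed DOI.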
-- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.18-4-686 (SMP w/2 CPU cores)
Locale: LANG=en_US, LC_CTYPE=en_US (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages referencer depends on:
ii  libart-2.0-2              2.3.19-3      Library of functions for 2D graphi
ii  libatk1.0-0               1.18.0-2      The ATK accessibility toolkit
ii  libbonobo2-0              2.18.0-2      Bonobo CORBA interfaces library
ii  libbonoboui2-0            2.18.0-5      The Bonobo UI library
ii  libboost-regex1.33.1      1.33.1-10     regular expression library for C++
ii  libc6                     2.5-7         GNU C Library: Shared libraries
ii  libcairo2                 1.4.6-1       The Cairo 2D vector graphics libra
ii  libfontconfig1            2.4.2-1.2     generic font configuration library
ii  libgcc1                   1:4.1.2-6     GCC support library
ii  libgconf2-4               2.18.0.1-3    GNOME configuration database syste
ii  libgconfmm-2.6-1c2        2.14.2-1      C++ wrappers for GConf (shared lib
ii  libglade2-0               1:2.6.0-4     library to load .glade files at ru
ii  libglademm-2.4-1c2a       2.6.2-2       C++ wrappers for libglade2 (shared
ii  libglib2.0-0              2.12.12-1     The GLib library of C routines
ii  libglibmm-2.4-1c2a        2.12.7-1      C++ wrapper for the GLib toolkit (
ii  libgnome-keyring0         0.8.1-2       GNOME keyring services library
ii  libgnome-vfsmm-2.6-1c2a   2.14.0-1      C++ wrappers for GnomeVFS (shared
ii  libgnome2-0               2.18.0-4      The GNOME 2 library - runtime file
ii  libgnomecanvas2-0         2.14.0-2      A powerful object-oriented display
ii  libgnomecanvasmm-2.6-1c2a 2.14.0-1      C++ wrappers for libgnomecanvas2 (
ii  libgnomemm-2.6-1c2        2.14.0-1      C++ wrappers for libgnome (shared
ii  libgnomeui-0              2.18.1-2      The GNOME 2 libraries (User Interf
ii  libgnomeuimm-2.6-1c2a     2.14.0-1      C++ wrappers for libgnomeui (share
ii  libgnomevfs2-0            1:2.18.1-2    GNOME Virtual File System (runtime
ii  libgtk2.0-0               2.10.12-1     The GTK+ graphical user interface
ii  libgtkmm-2.4-1c2a         1:2.8.8-1     C++ wrappers for GTK+ 2.4 (shared
ii  libice6                   1:1.0.3-2     X11 Inter-Client Exchange library
ii  liborbit2                 1:2.14.7-0.1  libraries for ORBit2 - a CORBA ORB
ii  libpango1.0-0             1.16.4-1      Layout and rendering of internatio
ii  libpoppler0c2             0.4.5-5.1     PDF rendering library
ii  libpopt0                  1.10-3        lib for parsing cmdline parameters
ii  libsigc++-2.0-0c2a        2.0.17-2      type-safe Signal Framework for C++
ii  libsm6                    1:1.0.2-2     X11 Session Management library
ii  libstdc++6                4.1.2-6       The GNU Standard C++ Library v3
ii  libx11-6                  2:1.0.3-7     X11 client-side library
ii  libxcursor1               1:1.1.8-2     X cursor management library
ii  libxext6                  1:1.0.3-2     X11 miscellaneous extension librar
ii  libxfixes3                1:4.0.3-2     X11 miscellaneous 'fixes' extensio
ii  libxi6                    1:1.0.1-4     X11 Input extension library
ii  libxinerama1              1:1.0.2-1     X11 Xinerama extension library
ii  libxml2                   2.6.28.dfsg-1 GNOME XML library
ii  libxrandr2                2:1.2.1-1     X11 RandR extension library
ii  libxrender1               1:0.9.2-1     X Rendering Extension client libra
referencer recommends no packages.
-- no debconf information
BT
7.9702 0 0 7.9702 340.5542 597.3164 Tm
[(www.elsev)11.4(ier.com/locate/co)8.9(gnit)]TJ
-32.0589 -63.7337 TD
[(0010-0277)15.5(/03/$)-299.5(-)-300.1(see)-293(front)-300.7(matter)]TJ
/F4 1 Tf
13.9915 0 TD
(\001)Tj
/F1 1 Tf
1.1666 0 TD
[(2003)-297.5(Elsevier)-289.8(Science)-293.2(B.V.)-299.7(All)-299.1(rights)-294.9(reserved.)]TJ
-15.1581 -1.2448 TD
[(doi:10.1016/)14.5(S)-95.3(0)]TJ
6.3307 0 TD
0.0983 Tc
[(010-0277\(02\)00)-6.3(235-4)]TJ
ET
--- End Message ---
--- Begin Message ---
Version: 1.2.2-2+rm
Dear submitter,
as the package referencer has just been removed from the Debian archive
unstable, we hereby close the associated bug reports. We are sorry
that we couldn't deal with your issue properly.
For details on the removal, please see https://bugs.debian.org/891699
The version of this package that was in Debian prior to this removal
can still be found using http://snapshot.debian.org/.
This message was generated automatically; if you believe that there is
a problem with it please contact the archive administrators by mailing
ftpmas...@ftp-master.debian.org.
Debian distribution maintenance software
pp.
Scott Kitterman (the ftpmaster behind the curtain)
--- End Message ---