Hi Petr,
Sorry for replying so late. I had lots of stuff to do.
I committed your patch a minute ago. It is a good thing to support the
MissingWidth key.
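
For anyone following along, here is a minimal standalone sketch of the lookup
order the patch is after: use the Widths entry when the code falls inside
FirstChar..LastChar, otherwise fall back to MissingWidth (which defaults to 0).
SimpleFontWidths and this CharWidth are hypothetical names for illustration
and not PoDoFo classes. The sketch treats LastChar as inclusive, which differs
slightly from the `c < m_nLast` check in the patch, and it returns the raw
glyph-space value without any scaling.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the font metrics data; not a PoDoFo type.
struct SimpleFontWidths {
    int firstChar = 0;            // /FirstChar from the font dictionary
    int lastChar = 0;             // /LastChar from the font dictionary (inclusive)
    std::vector<double> widths;   // /Widths, empty when the key is absent
    double missingWidth = 0.0;    // /MissingWidth from the descriptor, default 0
};

// Returns the unscaled width for a single-byte character code.
double CharWidth(const SimpleFontWidths& f, unsigned char c) {
    const int code = c;
    if (code >= f.firstChar && code <= f.lastChar) {
        const std::size_t idx = static_cast<std::size_t>(code - f.firstChar);
        // Guard against a Widths array shorter than the declared range.
        if (idx < f.widths.size())
            return f.widths[idx];
    }
    // Outside the range, or no usable Widths entry: fall back.
    return f.missingWidth;
}
```

With an empty widths vector, every code simply resolves to missingWidth, which
is the situation in the document from the report.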
Creating a CID Font from an existing object will require some work as support
for this is pretty incomplete in PoDoFo at the moment.
In your example, PdfFontCID::CreateCMap should not be called in the case of
creating a font object from a PDF object. The CMap is already in the PDF file,
so you would need code to parse the CMap from the PDF.
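
To give a rough idea of what such parsing involves, here is a standalone
sketch that reads the bfchar and simple bfrange sections of a ToUnicode-style
CMap. It assumes the CMap stream has already been fetched and decompressed
into a string, that the input is well formed, and that codes and targets fit
in 16 bits. TokenizeCMap, ParseHex16 and ParseToUnicode are hypothetical
helper names and not part of PoDoFo. The array form of bfrange and
multi-code-unit targets are skipped.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Splits CMap text into tokens. Hex strings come back as "<0041>";
// "<<" is its own token; everything else is split on whitespace.
static std::vector<std::string> TokenizeCMap(const std::string& text) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char ch = static_cast<unsigned char>(text[i]);
        if (std::isspace(ch)) { ++i; continue; }
        if (ch == '<' && i + 1 < text.size() && text[i + 1] == '<') {
            tokens.push_back("<<");
            i += 2;
            continue;
        }
        if (ch == '<') {
            std::string hex = "<";
            ++i;
            while (i < text.size() && text[i] != '>') {
                // Keep hex digits only; whitespace inside <...> is ignored.
                if (std::isxdigit(static_cast<unsigned char>(text[i]))) hex += text[i];
                ++i;
            }
            if (i < text.size()) ++i;  // skip the closing '>'
            hex += '>';
            tokens.push_back(hex);
            continue;
        }
        std::string word;
        while (i < text.size() && text[i] != '<' &&
               !std::isspace(static_cast<unsigned char>(text[i]))) {
            word += text[i++];
        }
        tokens.push_back(word);
    }
    return tokens;
}

// Accepts hex tokens with 1 to 4 digits, e.g. "<0410>".
static bool ParseHex16(const std::string& tok, std::uint16_t& out) {
    if (tok.size() < 3 || tok.size() > 6 || tok.front() != '<' || tok.back() != '>')
        return false;
    out = static_cast<std::uint16_t>(
        std::stoul(tok.substr(1, tok.size() - 2), nullptr, 16));
    return true;
}

// Builds a character-code to UTF-16 code unit table.
std::map<std::uint16_t, std::uint16_t> ParseToUnicode(const std::string& cmapText) {
    std::map<std::uint16_t, std::uint16_t> table;
    const std::vector<std::string> t = TokenizeCMap(cmapText);
    std::size_t i = 0;
    while (i < t.size()) {
        if (t[i] == "beginbfchar") {
            ++i;
            while (i + 1 < t.size() && t[i] != "endbfchar") {
                std::uint16_t src = 0, dst = 0;
                if (ParseHex16(t[i], src) && ParseHex16(t[i + 1], dst))
                    table[src] = dst;
                i += 2;
            }
        } else if (t[i] == "beginbfrange") {
            ++i;
            while (i + 2 < t.size() && t[i] != "endbfrange") {
                if (t[i + 2] == "[") {
                    // Array form: skipped in this sketch.
                    i += 3;
                    while (i < t.size() && t[i] != "]") ++i;
                    if (i < t.size()) ++i;
                    continue;
                }
                std::uint16_t lo = 0, hi = 0, dst = 0;
                if (ParseHex16(t[i], lo) && ParseHex16(t[i + 1], hi) &&
                    ParseHex16(t[i + 2], dst) && lo <= hi) {
                    for (unsigned c = lo; c <= hi; ++c)
                        table[static_cast<std::uint16_t>(c)] =
                            static_cast<std::uint16_t>(dst + (c - lo));
                }
                i += 3;
            }
        } else {
            ++i;
        }
    }
    return table;
}
```

A real implementation inside PoDoFo would more likely reuse the library's own
tokenizer and would also need to handle the codespace ranges.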
I am sure you can also provide a simple patch to fix the above problem! If you
need help, feel free to ask.
Regards,
Dom
On Sunday, February 27, 2011 05:59:19 pm Dominik Seichter wrote:
> ---------- Forwarded Message ----------
>
> Subject: [Podofo-users] PDF document with standard font and Identity-H
> encoding
> Date: Friday 25 February 2011
> From: Petr Machata <[email protected]>
> To: [email protected]
>
> Hi there,
>
> I'm using the latest SVN version of podofo on Linux to extract text from
> this PDF document:
>
> http://fictionbook.ru/author/sanin_vladimir_markovich/zov_polyar_1_v_lovushke/download.a6.pdf
>
> I have a couple more PDF files (all of them in Russian though) that
> behave the same way. The reproducer is this piece of code:
>
> #include <podofo/podofo.h>
> using namespace PoDoFo;
> int main (void) {
>     PdfMemDocument doc ("sanin.pdf");
>     PdfPage *page = doc.GetPage (2);
>     PdfObject *res = page->GetResources ();
>     PdfObject *font = res->GetDictionary ().GetKey ("Font")
>         ->GetDictionary ().GetKey ("F15");
>     font = doc.GetObjects ().GetObject (font->GetReference ());
>     doc.GetFont (font);
> }
>
> One problem that I see in the PDF file is that the font doesn't define the
> "Widths" value. It does define MissingWidth though, which I think
> could be used in lieu of that. I'm attaching my take on fixing that.
>
> After that, though, I'm stuck: podofo goes on to create a CID font, and
> PdfFontCID::CreateCMap does this:
> PdfFontMetricsFreetype* pFreetype =
> dynamic_cast<PdfFontMetricsFreetype*>(m_pMetrics);
> Unfortunately, that font is created in PdfFontFactory::CreateFont, and
> m_pMetrics is in fact an instance of PdfFontMetricsObject, so the
> dynamic_cast returns NULL.
>
> That's about the extent of what I can do without really understanding
> how the library in particular and PDF in general work. I've been only
> using podofo for about two days now, so I might be missing something.
> Any help is welcome.
>
> Thanks,
> Petr Machata
>
>
> ===File ~/fedora/podofo/podofo-0.8.4-missing-width.patch====
> ./src/PdfFontMetricsObject.h
> --- ./src/PdfFontMetricsObject.h~ 2010-10-11 13:45:42.000000000 +0200
> +++ ./src/PdfFontMetricsObject.h 2011-02-22 02:10:06.000000000 +0100
> @@ -205,6 +205,7 @@
> PdfName m_sName;
> PdfArray m_bbox;
> PdfArray m_width;
> + PdfObject *m_missingWidth;
> int m_nFirst;
> int m_nLast;
> unsigned int m_nWeight;
> ./src/PdfFontMetricsObject.cpp
> --- ./src/PdfFontMetricsObject.cpp~ 2010-09-25 12:22:30.000000000 +0200
> +++ ./src/PdfFontMetricsObject.cpp 2011-02-22 02:14:56.000000000 +0100
> @@ -41,8 +43,20 @@
> // OC 15.08.2010 BugFix: /FirstChar /LastChar /Widths are in the Font dictionary and not in the FontDescriptor
> m_nFirst = static_cast<int>(pFont->GetDictionary().GetKeyAsLong( "FirstChar", 0L ));
> m_nLast = static_cast<int>(pFont->GetDictionary().GetKeyAsLong( "LastChar", 0L ));
> +
> // OC 15.08.2010 BugFix: GetIndirectKey() instead of GetDictionary().GetKey() and "Widths" instead of "Width"
> - m_width = pFont->GetIndirectKey( "Widths" )->GetArray();
> + if( PdfObject *widths = pFont->GetIndirectKey( "Widths" ))
> + {
> + m_width = widths->GetArray();
> + m_missingWidth = NULL;
> + }
> + else
> + {
> + PdfObject *width = pDescriptor->GetDictionary().GetKey( "MissingWidth" );
> + if( width == NULL )
> + PODOFO_RAISE_ERROR_INFO( ePdfError_NoObject, "Font object defines neither Widths, nor MissingWidth values!" );
> + m_missingWidth = width;
> + }
>
> m_nWeight = static_cast<unsigned int>(pDescriptor->GetDictionary().GetKeyAsLong( "FontWeight", 400L ));
> m_nItalicAngle = static_cast<int>(pDescriptor->GetDictionary().GetKeyAsLong( "ItalicAngle", 0L ));
>
> @@ -78,7 +92,8 @@
>
> double PdfFontMetricsObject::CharWidth( unsigned char c ) const
> {
> - if( c > m_nFirst && c < m_nLast )
> + if( c >= m_nFirst && c < m_nLast
> + && c - m_nFirst < m_width.size () )
> {
> double dWidth = m_width[c - m_nFirst].GetReal();
>
> @@ -87,7 +102,10 @@
>
> }
>
> - return 0.0;
> + if( m_missingWidth != NULL )
> + return m_missingWidth->GetReal ();
> + else
> + return 0.0;
> }
>
> double PdfFontMetricsObject::UnicodeCharWidth( unsigned short c ) const
> ============================================================
>
> _______________________________________________
> Podofo-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>