Hi,

I don't have much time as I have to study and write a report for it,
so please don't expect further reply or testing from me. That said,
I do see some issues with your code:
(a) You use the old-style cast (long)individual_character; prefer
    static_cast<long>(individual_character).
(b) You assume that each 8-bit unit (byte) of the string is one full
    character, but in some encodings it is only part of a character.
(c) Relatedly, the values you get from CharWidth() may look valid purely
    by coincidence when you call the 8-bit version with such a part of a
    character encoded as more than one byte (there's an overload of
    that method accepting pdf_utf16be).
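
To illustrate (b) and (c), here is a minimal standalone sketch in plain C++11 without PoDoFo. The helper name Utf16BeUnits is made up for this example, and it assumes the string really is UTF-16BE, which is only one of the encodings a PDF string can use. It combines each pair of bytes into one 16-bit code unit instead of treating every byte as a character:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of PoDoFo): combine each pair of bytes of a
// UTF-16BE byte buffer into one 16-bit code unit in host order.
// A trailing odd byte, if any, is ignored.
std::vector<std::uint16_t> Utf16BeUnits(const std::string& bytes)
{
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2)
    {
        // Go through unsigned char first so bytes >= 0x80 do not sign-extend.
        const unsigned hi = static_cast<unsigned char>(bytes[i]);
        const unsigned lo = static_cast<unsigned char>(bytes[i + 1]);
        units.push_back(static_cast<std::uint16_t>((hi << 8) | lo));
    }
    return units;
}
```

For the four bytes 00 41 20 AC (U+0041 'A' followed by U+20AC, the euro sign) this yields the two code units 0x0041 and 0x20AC. A byte-wise loop over the same data would instead visit four values, and two of them (0x41 and 0x20) happen to look like a valid 'A' and a space, which is the kind of coincidence meant in (c).
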
So could you please try the following code? It is an excerpt; I hope it is
clear which section of your code it replaces. It should work as long as
you have no characters outside the Unicode Basic Multilingual Plane (BMP):

// remove the definition of TJ_string above or replace the type with PdfString
// and the name with TJ_pdfstring

for(int x = 0; x < size_of_array; x++)
{
    if(pArray[x].IsHexString() || pArray[x].IsString())
    {
        PdfString TJ_pdfstring = pArray[x].GetString().ToUnicode();

        // Now to test the font metrics.
        for (int s = 0; s < TJ_pdfstring.GetUnicodeLength(); s++)
        {
            pdf_utf16be individual_character = *(TJ_pdfstring.GetUnicode() + s);
            cout << "Font Size is: " << met->GetFontSize() << endl;
            cout << "Character Width is: " << met->CharWidth(individual_character) << endl;

            // NOW ATTEMPT TO EXTRACT GLYPH INFORMATION (FOR BMP CHARACTERS ONLY)

            pdf_uint16 UnicodeCharId;
#ifdef PODOFO_IS_LITTLE_ENDIAN // an alternative is podofo_is_little_endian(), but at runtime
            // Swap the two bytes: pdf_utf16be is big-endian, the host is not.
            UnicodeCharId = static_cast<pdf_uint16>((individual_character << 8) | (individual_character >> 8));
#else
            UnicodeCharId = individual_character;
#endif
            // Base 14 fonts distinguish Unicode values from character codes.
            // The cast yields NULL for every other kind of font metrics.
            const PdfFontMetricsBase14* base14met = dynamic_cast<const PdfFontMetricsBase14*>(met);
            if (base14met != NULL)
            {
                GlyphID = base14met->GetGlyphIdUnicode(static_cast<long>(UnicodeCharId));
            }
            else
            {
                GlyphID = met->GetGlyphId(static_cast<long>(UnicodeCharId));
            }
            cout << "************************* THE GLYPH ID IS: " << GlyphID << endl;
        }
    }
}
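
The byte swap and the BMP restriction in the excerpt above can also be shown without PoDoFo. The following is only a sketch in plain C++11. All helper names (SwapBytes16, HostIsLittleEndian, BigEndianToHost, DecodeUtf16) are made up for this example and are not PoDoFo functions. It converts a big-endian code unit to host order at runtime and combines surrogate pairs into full code points, which is the step the excerpt skips:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: swap the two bytes of a 16-bit value.
inline std::uint16_t SwapBytes16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Runtime check of the host byte order: on a little-endian host the
// least significant byte of the probe is stored first.
inline bool HostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    return *reinterpret_cast<const unsigned char*>(&probe) == 1;
}

// Convert a 16-bit code unit stored in big-endian byte order to host order.
inline std::uint16_t BigEndianToHost(std::uint16_t be)
{
    return HostIsLittleEndian() ? SwapBytes16(be) : be;
}

// Decode host-order UTF-16 code units into Unicode code points.
// A high surrogate followed by a low surrogate becomes one code point
// above U+FFFF; everything else is passed through unchanged.
std::vector<std::uint32_t> DecodeUtf16(const std::vector<std::uint16_t>& units)
{
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < units.size(); ++i)
    {
        const std::uint32_t u = units[i];
        if (u >= 0xD800 && u <= 0xDBFF && i + 1 < units.size()
            && units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF)
        {
            const std::uint32_t low = units[++i];
            out.push_back(0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00));
        }
        else
        {
            out.push_back(u); // BMP character, or an unpaired surrogate
        }
    }
    return out;
}
```

Whether a given font's metrics can actually return a glyph ID for a code point above U+FFFF is a separate question that this sketch does not answer.
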

To all experienced PoDoFo developers and users: feel free to comment on this.
I hope this helps, even though it is untested (I need the time for studying,
sorry).

Best regards, mabri


________________________________

From: Svetlana Watkins <svetlana.watk...@gmail.com>
To: "podofo-users@lists.sourceforge.net" <podofo-users@lists.sourceforge.net> 
Sent: 14:15 Thursday, 22 October 2015
Subject: Re: [Podofo-users] GetGlyphID in PoDoFo



I don't want to be the drama queen of this site, but I am really stumped by this
problem. It's costing us time at the moment and I'm determined to solve it.
Maybe there is someone out there who has used this method and can explain
what I have done wrong here. I have provided a small program as a reference.

Thank you so much. Sorry to be a pest, but it's costing me a lot of time. Is
anyone familiar with this problem? If the question is not clear, feel free to
tell me so. Thanks




On Tue, Oct 20, 2015 at 2:22 PM, Svetlana Watkins <svetlana.watk...@gmail.com> 
wrote:

I am now using the latest PoDoFo version from SVN and the problem is still
there. Can anybody help with this? Thanks
>
>
>
>---------- Forwarded message ----------
>From: Svetlana Watkins <svetlana.watk...@gmail.com>
>Date: Tue, Oct 20, 2015 at 1:49 PM
>Subject: GetGlyphID in PoDoFo
>To: "podofo-users@lists.sourceforge.net" <podofo-users@lists.sourceforge.net>
>
>
>
>I am having trouble getting the PdfFontMetrics::GetGlyphId(long lUnicode)
>function to work. I have pasted some very simplified code below. For the
>current font some functions work, as shown below: for example, I can
>successfully get the char width and font size for each character, but for some
>reason not the glyph ID. I am using the latest version on the PoDoFo website,
>0.9.3, but am having trouble with the Subversion link (I'm using TortoiseSVN to
>download the files). Could it be that this issue has already been fixed by a
>patch that is only incorporated in the latest SVN version?
>
>
>Could someone please test my code below and maybe indicate what I have done
>wrong here? If you look down to the inline comment
>"// NOW ATTEMPT TO EXTRACT GLYPH INFORMATION", just below that is where I have
>applied the GetGlyphId method. Thanks
>
>
>
>
>
>
>
>
>
>#include <iostream>
>#include <string>
>#include <cstdlib>
>#include <podofo/podofo.h>
>#include <stack>
>
>
>
>
>using namespace PoDoFo;
>using namespace std;
>
>
>
>
>PdfMemDocument doc;
>int page_count;
>PdfPage *page;
>
>
>EPdfContentsType type;
>PdfVariant var;
>const char* token;
>
>
>PdfFont *Font;
>const PdfFontMetrics *met;
>
>
>PdfArray pArray; // for extracting text under TJ operator
>int size_of_array;
>string TJ_string;
>
>
>stack<PdfVariant> PdfStack;
>
>
>// THE PURPOSE OF THIS SMALL PROGRAM IS TO SHOW HOW I AM EXTRACTING THE GLYPH ID.
>long GlyphID;
>
>
>
>
>int main(int argc, char **argv)
>{
>try{
>
>
>doc.Load(argv[1]);
>page_count = doc.GetPageCount();
>
>
>for(int i = 0; i < page_count;i++)
>{
>    page = doc.GetPage(i);
>    PdfContentsTokenizer tokenizer(page); // tokenize page
>
>
>
>
>
>
>while(tokenizer.ReadNext(type,token,var))
>{
>
>
>    if(type==ePdfContentsType_Keyword)
>    {
>        string keyword;
>        keyword = token;
>
>
>       if(keyword == "Tf")
>       {
>
>
>           PdfStack.pop(); //pop the font size off the stack.
>           PdfName name_of_font = PdfStack.top().GetName();
>           PdfObject *ofont = page->GetFromResources(PdfName("Font"),name_of_font);
>           Font = doc.GetFont(ofont);
>           met = Font->GetFontMetrics(); // get the font metrics for current font.
>           //met is global.
>       }
>
>
>       if(keyword == "TJ")
>        {
>
>
>            pArray = PdfStack.top().GetArray();
>            PdfStack.pop();
>            size_of_array = pArray.GetSize();
>
>
>            for(int x = 0; x < size_of_array; x++)
>            {
>                if(pArray[x].IsHexString()|| pArray[x].IsString())
>                {
>                    TJ_string = pArray[x].GetString().GetString();
>
>
>                    // Now to test the font metrics.
>                   for (int s = 0; s < TJ_string.length(); s++)
>                    {
>                        unsigned char individual_character = TJ_string[s];
>
>
>                    //THE BELOW CALL TO CURRENT FONT DATA WORKS FINE
>                    cout << "Font Size is: " << met->GetFontSize() << endl;
>                    cout << "Character Width is: " << met->CharWidth(individual_character) << endl;
>
>
>                    // NOW ATTEMPT TO EXTRACT GLYPH INFORMATION
>                    GlyphID = met->GetGlyphId((long)individual_character);
>                    cout << "************************* THE GLYPH ID IS: " << GlyphID << endl;
>
>
>
>
>                    }
>
>
>                }
>            }
>
>
>
>
>        }
>
>
>
>
>    }
>    else if(type==ePdfContentsType_Variant)
>    {
>        PdfStack.push(var);
>    }
>
>
>
>
>}
>
>
>}
>}
>catch(PdfError &err)
>{
>    cout << "The Error is: " << err.what() << endl;
>}
>
>
>
>
>}
>


------------------------------------------------------------------------------


_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users
