Hi,

I don't have much time, as I have to study and write a report, so please don't expect further replies or testing from me. That said, I do see some issues with your code:

(a) The use of the old-style cast (long)individual_character.
(b) You assume that each 8-bit unit (byte) of the string is one full character; in some encodings, however, it is only part of one character.
(c) Relatedly, the values you get from CharWidth() may look valid purely by coincidence when you call the 8-bit version with such a part of a character that is encoded as more than one byte (there's an overload of that method accepting pdf_utf16be).

So could you please try the following code? It's an excerpt; I hope it's clear which section of your code it replaces. It should work as long as you don't have a character outside the Unicode Basic Multilingual Plane (BMP):

    // remove the definition of TJ_string above or replace the type with PdfString
    // and the name with TJ_pdfstring
    for(int x = 0; x < size_of_array; x++)
    {
        if(pArray[x].IsHexString() || pArray[x].IsString())
        {
            PdfString TJ_pdfstring = pArray[x].GetString().ToUnicode();

            // Now to test the font metrics.
            for (int s = 0; s < TJ_pdfstring.GetUnicodeLength(); s++)
            {
                pdf_utf16be individual_character = *(TJ_pdfstring.GetUnicode() + s);

                cout << "Font Size is: " << met->GetFontSize() << endl;
                cout << "Character Width is: " << met->CharWidth(individual_character) << endl;

                // NOW ATTEMPT TO EXTRACT GLYPH INFORMATION (FOR BMP CHARACTERS ONLY)
                pdf_uint16 UnicodeCharId;
    #ifdef PODOFO_IS_LITTLE_ENDIAN
                // an alternative is podofo_is_little_endian(), but at runtime
                UnicodeCharId = (individual_character << 8) | (individual_character >> 8);
    #else
                UnicodeCharId = individual_character;
    #endif
                // Base14 font metrics make a difference between Unicode and char code.
                // met is a pointer to const, so the cast target must be const as well.
                const PdfFontMetricsBase14* base14met =
                    dynamic_cast<const PdfFontMetricsBase14*>(met);
                if (base14met != NULL)
                {
                    GlyphID = base14met->GetGlyphIdUnicode(static_cast<long>(UnicodeCharId));
                }
                else
                {
                    GlyphID = met->GetGlyphId(static_cast<long>(UnicodeCharId));
                }
                cout << "************************* THE GLYPH ID IS: " << GlyphID << endl;
            }
        }
    }

To all experienced PoDoFo and PoDoFo-using developers: Feel free to comment on this.

I hope this helps, even though it is untested (I need the time for studying, sorry).

Best regards,
mabri

________________________________
From: Svetlana Watkins <svetlana.watk...@gmail.com>
To: "podofo-users@lists.sourceforge.net" <podofo-users@lists.sourceforge.net>
Sent: 14:15 Thursday, 22 October 2015
Subject: Re: [Podofo-users] GetGlyphID in PoDoFo

I don't want to be the drama queen of this site, but I am really stumped by this problem. It's costing us time at the moment and I'm determined to solve it. Maybe there is someone out there who has used this 'method' who can explain what I have done wrong here.
I have provided a small program as a reference. Thank you so much.

Sorry to be a pest, but it's costing me a lot of time. Is anyone familiar with this problem? If the question is not clear, feel free to tell me so. Thanks

On Tue, Oct 20, 2015 at 2:22 PM, Svetlana Watkins <svetlana.watk...@gmail.com> wrote:

I am now using the latest PoDoFo version from SVN and the problem is still there. Can anybody help with this? Thanks

>
>---------- Forwarded message ----------
>From: Svetlana Watkins <svetlana.watk...@gmail.com>
>Date: Tue, Oct 20, 2015 at 1:49 PM
>Subject: GetGlyphID in PoDoFo
>To: "podofo-users@lists.sourceforge.net" <podofo-users@lists.sourceforge.net>
>
>I am having trouble getting the PdfFontMetrics::GetGlyphId(long lunicode)
>function to work. I have pasted some very simplified code below. For the
>current font some functions are working, as shown below. For example, I can
>successfully get the Char Width and Font Size for each character, but for some
>reason not the Glyph ID. I am using the latest version on the PoDoFo website,
>0.9.3, but am having trouble with the Subversion link (I'm using TortoiseSVN to
>download the files). Could it be that this issue has already been fixed with a
>patch that is only incorporated in the latest SVN version?
>
>Could someone please test my code below and maybe indicate what I have done
>wrong here? If you look down to the inline comment
>"// NOW ATTEMPT TO EXTRACT GLYPH INFORMATION", just below that is where I have
>applied the GetGlyphId method.
>Thanks
>
>#include <iostream>
>#include <string>
>#include <cstdlib>
>#include <podofo/podofo.h>
>#include <stack>
>
>using namespace PoDoFo;
>using namespace std;
>
>PdfMemDocument doc;
>int page_count;
>PdfPage *page;
>
>EPdfContentsType type;
>PdfVariant var;
>const char* token;
>
>PdfFont *Font;
>const PdfFontMetrics *met;
>
>PdfArray pArray; // for extracting text under TJ operator
>int size_of_array;
>string TJ_string;
>
>stack<PdfVariant> PdfStack;
>
>// THE PURPOSE OF THIS SMALL PROGRAM IS TO SHOW HOW I AM EXTRACTING THE GLYPH ID.
>long GlyphID;
>
>int main(int argc, char **argv)
>{
>  try{
>
>    doc.Load(argv[1]);
>    page_count = doc.GetPageCount();
>
>    for(int i = 0; i < page_count;i++)
>    {
>      page = doc.GetPage(i);
>      PdfContentsTokenizer tokenizer(page); // tokenize page
>
>      while(tokenizer.ReadNext(type,token,var))
>      {
>        if(type==ePdfContentsType_Keyword)
>        {
>          string keyword;
>          keyword = token;
>
>          if(keyword == "Tf")
>          {
>            PdfStack.pop(); //pop the font size off the stack.
>            PdfName name_of_font = PdfStack.top().GetName();
>            PdfObject *ofont = page->GetFromResources(PdfName("Font"),name_of_font);
>            Font = doc.GetFont(ofont);
>            met = Font->GetFontMetrics(); // get the font metrics for current font.
>            //met is global.
>          }
>
>          if(keyword == "TJ")
>          {
>            pArray = PdfStack.top().GetArray();
>            PdfStack.pop();
>            size_of_array = pArray.GetSize();
>
>            for(int x = 0; x < size_of_array; x++)
>            {
>              if(pArray[x].IsHexString()|| pArray[x].IsString())
>              {
>                TJ_string = pArray[x].GetString().GetString();
>
>                // Now to test the font metrics.
>                for (int s = 0; s < TJ_string.length(); s++)
>                {
>                  unsigned char individual_character = TJ_string[s];
>
>                  //THE BELOW CALL TO CURRENT FONT DATA WORKS FINE
>                  cout << "Font Size is: " << met->GetFontSize() << endl;
>                  cout << "Character Width is: " << met->CharWidth(individual_character) << endl;
>
>                  // NOW ATTEMPT TO EXTRACT GLYPH INFORMATION
>                  GlyphID = met->GetGlyphId((long)individual_character);
>                  cout << "************************* THE GLYPH ID IS: " << GlyphID << endl;
>                }
>              }
>            }
>          }
>        }
>        else if(type==ePdfContentsType_Variant)
>        {
>          PdfStack.push(var);
>        }
>      }
>    }
>  }
>  catch(PdfError &err)
>  {
>    cout << "The Error is: " << err.what() << endl;
>  }
>}
>
------------------------------------------------------------------------------
_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users
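
The two encoding points raised in this thread (a byte is not always a whole character, and a UTF-16BE code unit needs a byte swap on a little-endian host) can be tried out without PoDoFo. What follows is only a minimal standalone sketch: the helper names swap_bytes16, is_surrogate, decode_utf16be and run_examples are hypothetical and not part of PoDoFo, and std::uint16_t merely stands in for PoDoFo's 16-bit types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not PoDoFo API): swap the two bytes of a 16-bit value.
// On a little-endian host this turns a raw UTF-16BE code unit into native order.
inline std::uint16_t swap_bytes16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Hypothetical helper: true if a native-order code unit is half of a surrogate
// pair, i.e. it belongs to a character outside the BMP and cannot be used as a
// Unicode code point on its own.
inline bool is_surrogate(std::uint16_t unit)
{
    return unit >= 0xD800 && unit <= 0xDFFF;
}

// Hypothetical helper: combine big-endian byte pairs into native-order code
// units. Building each unit arithmetically works on any host byte order.
// A trailing odd byte is ignored.
inline std::vector<std::uint16_t> decode_utf16be(const std::string& bytes)
{
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2)
    {
        const unsigned hi = static_cast<unsigned char>(bytes[i]);
        const unsigned lo = static_cast<unsigned char>(bytes[i + 1]);
        units.push_back(static_cast<std::uint16_t>((hi << 8) | lo));
    }
    return units;
}

// Small usage example; call it from main() to run the checks.
inline void run_examples()
{
    // The euro sign U+20AC takes three bytes in UTF-8, so a byte-wise loop
    // sees three "characters", none of which is U+20AC.
    const std::string euro_utf8("\xE2\x82\xAC");
    assert(euro_utf8.size() == 3);

    // "A" followed by the euro sign, encoded as UTF-16BE (four bytes).
    const std::vector<std::uint16_t> units =
        decode_utf16be(std::string("\x00\x41\x20\xAC", 4));
    assert(units.size() == 2);
    assert(units[0] == 0x0041 && units[1] == 0x20AC);
    assert(!is_surrogate(units[1]));

    // A unit read raw from big-endian memory on a little-endian host.
    assert(swap_bytes16(0x4100) == 0x0041);
}
```

Checking is_surrogate() before a glyph lookup is one way to skip non-BMP characters explicitly instead of passing half of a surrogate pair to the font metrics.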