On Wednesday, 28 July 2010, srinivas adicherla wrote:
> Finding a way to sort the PDF text blocks, find the
> number of columns in a page.
>
> @albert The Qt methods don't expose the selections, but what if we do the
> block sorting in the backend poppler code itself, so that we can expose it
> to glib or Qt whenever we need? How about it?

I'm always open to improvements :-)

Albert

> On Wed, Jul 28, 2010 at 9:00 AM, <poppler- [email protected]> wrote:
> > Today's Topics:
> >
> >    1. Re: Finding a way to sort the Pdf Text Blocks, find the
> >       number of columns in a page. (Albert Astals Cid)
> >    2. Re: Vertical or horizontal writing? (Albert Astals Cid)
> >    3. FYI: embedded fonts for vertical text in PDF by MS Office
> >       2007/2010 (suzuki toshiya)
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Tue, 27 Jul 2010 20:36:56 +0100
> > From: Albert Astals Cid <[email protected]>
> > Subject: Re: [poppler] Finding a way to sort the Pdf Text Blocks,
> >         find the number of columns in a page.
> > To: [email protected]
> > Message-ID: <[email protected]>
> > Content-Type: Text/Plain; charset="us-ascii"
> >
> > On Tuesday, 27 July 2010, srinivas adicherla wrote:
> > > Hi all,
> > >
> > > I used poppler_page_get_selection_region() to find the line
> > > rectangles of each and every line in a page. From those I find the
> > > blocks, then I find the columns of the page. From the number of
> > > columns of the page, I am able to sort the blocks, so that the
> > > selection is very good.
> > >
> > > Right now selection in poppler is a bit of a problem. After doing all
> > > this, it almost looks like Adobe Reader's selection.
> > >
> > > Please give me suggestions on improving this.
> >
> > Carlos?
> > The Qt frontends don't expose the selection method, so I think it's up
> > to you for the moment.
> >
> > > I attached two files with this mail.
> > >
> > > getcol.c is able to sort the blocks in single/multicolumn PDFs.
> > > getcolumn.c is based on the above sorting and is used to do the selection.
> > >
> > > I sent a patch for getting the PDF ID from the document before. Albert
> > > said it was OK. But he asked Carlos?
> > >
> > > Please give me the status of it.
> >
> > Carlos?
> >
> > Albert
> >
> > > Thanks
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Tue, 27 Jul 2010 20:41:52 +0100
> > From: Albert Astals Cid <[email protected]>
> > Subject: Re: [poppler] Vertical or horizontal writing?
> > To: [email protected]
> > Message-ID: <[email protected]>
> > Content-Type: Text/Plain; charset="us-ascii"
> >
> > On Tuesday, 27 July 2010, [email protected] wrote:
> > > Dear Albert,
> > >
> > > On Tue, 27 Jul 2010 10:32:45 +0900
> > > [email protected] wrote:
> > > >>But I'd prefer you to use an enum instead of an int, at least on the
> > > >>poppler-qt4 level; can you make the appropriate changes?
> > > >
> > > >OK, I will improve it, of course. But please let me ask for
> > > >your comment about the appropriate design.
> > > >
> > > >When CMap->parse() parses a CMap resource, it can load any
> > > >integer value into CMap->wMode. And the type of the return
> > > >value from CMap->getWMode() (and GfxFont->getWMode()) is
> > > >int.
> > > >
> > > >In the FontInfo class, should I restrict the writing mode
> > > >enumeration to the 2 correct values: 0/horizontal or
> > > >1/vertical?
> > > >
> > > >Or is it better to have 3 values: 0/horizontal, 1/vertical
> > > >and -1 (or 2, or anything else) for broken writing mode
> > > >info?
> >
> > Well, reading the specification, it says that 0 is the default, so I
> > understand that if there is a value different from 0 or 1, 0 should
> > be used.
> >
> > Albert
> >
> > > I've just drafted a patch using an enum type in Poppler::FontInfo::wMode
> > > and its copies in the Qt4/GLib/cpp bindings. Please find the attached
> > > patch.
> > >
> > > --
> > >
> > > But Cobra has found that font-level writing mode detection
> > > is insufficient, even if we restrict the scope to PDFs
> > > generated by popular applications. I attached a PDF
> > > including vertical text which was generated by the MS Office
> > > 2010 PDF generator add-in. The embedded font is connected
> > > with Identity-H, so my patch recognizes the font as
> > > horizontal. I am trying to detect the expected result by using
> > > text-level information. So please don't hurry to evaluate
> > > this patch. I must work more.
> > >
> > >
> > > Regards,
> > > mpsuzuki
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Wed, 28 Jul 2010 12:29:29 +0900
> > From: suzuki toshiya <[email protected]>
> > Subject: [poppler] FYI: embedded fonts for vertical text in PDF by MS
> >         Office 2007/2010
> > To: [email protected]
> > Message-ID: <[email protected]>
> > Content-Type: text/plain; charset="iso-2022-jp"
> >
> > Hi,
> >
> > When I checked the PDFs generated by the MS Office 2007 & 2010
> > add-ins, I found a difference in their font embedding.
> >
> > * MS Office 2007
> >   The embedded font is named with the prefix "@". If I use
> >   MS Mincho, the font name is "@MS Mincho". Such @-prefixed
> >   names are legacy style. If the source document uses
> >   both horizontal and vertical text, non-prefixed and
> >   @-prefixed font objects are embedded in the PDF.
> >
> > * MS Office 2010
> >   The embedded font is always non-prefixed. If the source
> >   document uses both horizontal and vertical text, a
> >   single non-prefixed font object covering the glyphs in both
> >   texts is embedded in the PDF.
> >
> > For concrete examples, please find the attached PDFs.
> >
> > I was thinking @-prefixed font names were only used by
> > legacy applications, from when the Win32 GUI framework didn't support
> > vertical text editing. Seeing such names in applications
> > of the 21st century was an interesting experience for me.
> >
> > Regards,
> > mpsuzuki
> >
> > -------------- next part --------------
> > A non-text attachment was scrubbed...
> > Name: msword2010-vert4.pdf
> > Type: application/pdf
> > Size: 38863 bytes
> > Desc: not available
> > URL: <http://lists.freedesktop.org/archives/poppler/attachments/20100728/d13e9f5f/attachment.pdf>
> > -------------- next part --------------
> > A non-text attachment was scrubbed...
> > Name: msword2007-vert.pdf
> > Type: application/pdf
> > Size: 50509 bytes
> > Desc: not available
> > URL: <http://lists.freedesktop.org/archives/poppler/attachments/20100728/d13e9f5f/attachment-0001.pdf>
> >
> > ------------------------------
> >
> > End of poppler Digest, Vol 65, Issue 48
> > ***************************************

_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
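
For reference, the writing-mode rule discussed in message 2 (0 is the default, so any value other than 0 or 1 is treated as 0) could be sketched roughly as below. This is only an illustration: the WritingMode enum and normalizeWMode() are hypothetical names, not part of poppler or of the patch mentioned in the thread. The raw int stands in for what CMap->getWMode() or GfxFont->getWMode() would return.

```cpp
// Hypothetical enum for illustration; not poppler's actual API.
enum class WritingMode {
    Horizontal = 0,  // the default writing mode
    Vertical = 1
};

// Map a raw integer writing mode to the two-valued enum.
// Any value other than 1 (including broken values such as -1 or 2)
// falls back to Horizontal, the default.
constexpr WritingMode normalizeWMode(int raw) {
    return raw == 1 ? WritingMode::Vertical : WritingMode::Horizontal;
}
```

With this choice the enum needs no third "broken" value, at the cost of hiding malformed writing-mode info from the caller.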

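
The block sorting described in message 1 (find blocks, find columns, then order the blocks by column) might look something like the following sketch. It is not the code from getcol.c, which is not shown in the thread; the Block struct and groupIntoColumns() are hypothetical, and the sketch assumes the block rectangles are already known, that y grows downward, and that columns do not overlap horizontally.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical block rectangle; y is assumed to grow downward.
struct Block {
    double x1, y1, x2, y2;
};

// Group blocks into columns (left to right), each sorted top to bottom.
// A new column starts when a block's left edge lies to the right of
// everything collected in the current column.
std::vector<std::vector<Block>> groupIntoColumns(std::vector<Block> blocks) {
    std::sort(blocks.begin(), blocks.end(),
              [](const Block& a, const Block& b) { return a.x1 < b.x1; });

    std::vector<std::vector<Block>> columns;
    double right = 0.0;  // right edge of the current column
    for (const Block& b : blocks) {
        if (columns.empty() || b.x1 > right) {
            columns.emplace_back();
            right = b.x2;
        } else {
            right = std::max(right, b.x2);
        }
        columns.back().push_back(b);
    }

    // Reading order inside a column is top to bottom.
    for (auto& column : columns) {
        std::sort(column.begin(), column.end(),
                  [](const Block& a, const Block& b) { return a.y1 < b.y1; });
    }
    return columns;
}
```

The number of columns is then columns.size(). A block that spans the full page width (a title, for example) would merge all columns into one here, so a real implementation would need to handle such blocks separately.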