Re: [Podofo-users] Loading pdfs with large number of trailers / updates causes stack overflow

2024-01-30 Thread Michal Sudolsky
>
>
> right? When you set the s_maxRecursionDepth to large-enough value, will
> PoDoFo be able to open the file? Possibly also making the stack larger,
> to accommodate the recursion.
>
>
I suppose it will work fine. Using recursion there is not a very good idea.
Also, there are two keys that need to be followed: "XRefStm" and "Prev".
@Francesco Pretto, maybe you could fix that in the new podofo?
Something like this could work, for example (pseudocode):

xrefs   - vector (or stack) of xref offsets still to visit;
checked - set of xref offsets already visited;
xrefs.push_back(first xref offset, taken from startxref);
while (!xrefs.empty())
{
  XRefType xref = xrefs.back();
  xrefs.pop_back();
  if (checked.contains(xref)) continue; // to avoid cycles
  checked.insert(xref);
  ...
  // only a non-stream xref with a trailer can have both keys;
  // an xref stream contains just a single "previous xref"
  if (trailer has "XRefStm") xrefs.push_back(value of "XRefStm");
  if (trailer has "Prev")    xrefs.push_back(value of "Prev");
}
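
A minimal C++17 sketch of the same idea, under stated assumptions: the
XRefSection struct, the map that stands in for the parsed file, and the
function name CollectXRefOffsets are all hypothetical and are not PoDoFo API.
It only shows that an explicit work list plus a visited set replaces the
recursion, so the depth of the update chain no longer affects stack usage.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <set>
#include <vector>

// Hypothetical model of one parsed xref section (not a PoDoFo type).
struct XRefSection {
    std::optional<std::uint64_t> xrefStm; // trailer /XRefStm offset, if present
    std::optional<std::uint64_t> prev;    // trailer /Prev offset, if present
};

// Walks the xref chain iteratively and returns the offsets in visit order.
// "file" maps an offset to the section found there (stand-in for real parsing).
std::vector<std::uint64_t> CollectXRefOffsets(
    const std::map<std::uint64_t, XRefSection>& file, std::uint64_t startxref)
{
    std::vector<std::uint64_t> order;
    std::vector<std::uint64_t> pending{startxref}; // explicit stack, no recursion
    std::set<std::uint64_t> checked;               // offsets already visited

    while (!pending.empty()) {
        const std::uint64_t offset = pending.back();
        pending.pop_back();
        if (!checked.insert(offset).second)
            continue; // already visited: breaks cycles
        const auto it = file.find(offset);
        if (it == file.end())
            continue; // dangling offset: nothing to read
        order.push_back(offset);
        // Assumption: XRefStm is handled before Prev, so Prev is pushed first
        // and XRefStm ends up on top of the stack.
        if (it->second.prev)
            pending.push_back(*it->second.prev);
        if (it->second.xrefStm)
            pending.push_back(*it->second.xrefStm);
    }
    return order;
}
```

With this shape a file with 100000 incremental updates costs 100000 loop
iterations instead of 100000 nested calls, and a /Prev loop simply terminates.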



___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] PoDoFo SO versioning

2023-03-22 Thread Michal Sudolsky
>
>
> The rationale of "10" is really to point out it's not the first ABI
> that was produced. If you pick a single number and start from "1"
> there's nothing before other than "0". You are the second person (the
> first was Mattia) to prefer SOVERSION "1", so at this point I want to
> hear from zyx that, if I understood correctly, seems to care about
> packaging PoDoFo in RedHat based distributions.
>

Or other alternatives:

1. In openssl they skipped 2 and went from 1.1 to 3, where they also started
single-digit versioning in the soname. So for podofo it would be 2 for now. But
maybe their reason was different and their soversion reflects the
actual openssl major version.
2. Use just the major version of podofo, so for 0.10.0 it would be 0 for now
(which is different from 0.9.7). This assumes that the major version of podofo
is increased only due to breaking changes and that this is equivalent to
ABI changes (should it be?).



> Regards,
> Francesco
>


Re: [Podofo-users] PoDoFo SO versioning

2023-03-22 Thread Michal Sudolsky
On Wed, Mar 22, 2023 at 10:39 AM Francesco Pretto  wrote:

> On Wed, 22 Mar 2023 at 06:19, zyx  wrote:
> > > I still suggest to move to a single number SO versioning
> >
> > Okay, you convinced me. You've a "green" from my side for this change.
> >
>
> Good. Also doors will always be opened to move back to a "full" X.Y
> versioning later, if for example we find tools that update the version
> automatically (and we bend CMake to use only the major component X in
> SONAME), so I think there's no need to worry too much about this
> change now.
>
>
As Y and other parts are really irrelevant I think it is better to have
just libpodofo.so.X than to have both symlink libpodofo.so.X and
actual libpodofo.so.X.Y.

> Cheers,
> Francesco
>
>


Re: [Podofo-users] PoDoFo SO versioning

2023-03-22 Thread Michal Sudolsky
>
>
> I also suggest:
> - To arbitrarily start with SOVERSION "10" because it makes more clear
> that there were other incompatible ABIs before;
>

Why not just use 1? Latest podofo is libpodofo.so.0.9.7 so why not restart
with libpodofo.so.1?


> [1] https://access.redhat.com/documentation/it-it/red_hat_enterprise_linux/7/html/developer_guide/creating-libraries-gcc#the_soname_mechanism
> [2] https://cmake.org/cmake/help/latest/prop_tgt/SOVERSION.html
>
>


Re: [Podofo-users] PoDoFo 0.10.0-rc1

2023-03-21 Thread Michal Sudolsky
On Tue, Mar 21, 2023 at 9:12 PM Francesco Pretto  wrote:

> On Tue, 21 Mar 2023 at 17:02, Michal Sudolsky  wrote:
> > So if I understand correctly these 3 variables will never be there at
> once?
>
> Not exactly. Based on feedback from xyz, I partially reverted the
> previous variables. We have now just a single PODOFO_BUILD_STATIC that
> will switch between a static build when true, and a shared one when
> false. PODOFO_BUILD_SHARED can't be set externally anymore (it's set
> internally only) and there's no dual target static/shared support just
> yet (it may be introduced in the future). The full list of supported
> CMake defines is here[1].
>

So in the future it will be removed and replaced by PODOFO_ENABLE_STATIC_TARGET
and PODOFO_ENABLE_SHARED_TARGET? What will it build if neither is set or
both are false?


> Regards,
> Francesco
>
> [1] https://github.com/podofo/podofo/#cmake-switches
>


Re: [Podofo-users] PoDoFo SO versioning

2023-03-21 Thread Michal Sudolsky
On Mon, Mar 20, 2023 at 12:46 PM Francesco Pretto  wrote:

> On Mon, 20 Mar 2023 at 11:50, Mattia Rizzolo  wrote:
> > Potentially, if you use more than one number (especially the 2-number
> > variant) you can use that to identify versions that added symbols
> > without removing something else, therefore the ABI was "increased"
> > without breaking.
> > But OTOH that just doesn't really work decently from a program
> > perspective...
> >
> >
>
> This is one thing that I would like to investigate if it can benefit
> or not really. If there are two numbers, one specifying the ABI didn't
> break, and the second with the additions, eg. 1.10, still any program
> linking to a 2-number SO version 1.11 will refuse to load with 1.10,
> even if it didn't use anything new introduced in 1.11, correct?
>

From what I know, the dynamic linker does not understand semver; it just
searches for matching names (probably both the file name and the soname). So in
this example you will probably also have a symlink with version 1, and the version
in the soname will also be set to 1. Others will link to version 1, not to 1.11 or
1.10, so both would potentially load with any user of the SO. If something needs
new functions from 1.11 but there is 1.10 in the system, it should probably
be resolved/updated by the package manager?





Re: [Podofo-users] PoDoFo 0.10.0-rc1

2023-03-21 Thread Michal Sudolsky
On Fri, Mar 10, 2023 at 2:45 PM Francesco Pretto  wrote:

> On Fri, 10 Mar 2023 at 13:10, zyx  wrote:
> > What I'm questioning is the variable name, the PODOFO_STATIC. Why would
> > not keep the PODOFO_BUILD_STATIC name for it? Is there any advantage
> > with it?
>
> The reason was to leave PODOFO_BUILD_STATIC and PODOFO_BUILD_SHARED
> free for potential future use case when we want to forcefully build
> both the static and shared targets (this is not supported today). If
> you are unconvinced with the current naming, try to imagine these
> variables by their semantical meaning:
>
> 1) Boolean to decide if building the default library target as static
> (TRUE) or shared (FALSE) and use it for tests/tools/examples. Today
> name: PODOFO_STATIC;
> 2) Proposed future boolean* to decide if forcefully build the static
> library target when set to TRUE. My proposal was to name this
> PODOFO_BUILD_STATIC;
> 3) Proposed future boolean* to decide if forcefully build the shared
> library target when set to TRUE. My proposal was to name this
> PODOFO_BUILD_SHARED.
>

So if I understand correctly, these 3 variables will never exist at the same time?
For now there is just PODOFO_STATIC to decide whether to build static or
shared; later it will be removed and replaced by
PODOFO_BUILD_STATIC and PODOFO_BUILD_SHARED, so that it will be possible to
build just static, just shared, or both (with shared as the default if both are
unset or false?).



> Another possible naming that would be more backward compatible could be:
> 1) PODOFO_BUILD_STATIC;
> 2) PODOFO_ENABLE_STATIC_TARGET;
> 3) PODOFO_ENABLE_SHARED_TARGET.
>
> Even with this naming I would still drop support for
> PODOFO_BUILD_SHARED: please, one decision -> one variable, not two
> with complementary meaning.
>
> Let me know what you think about and/or add your proposal.
>
> (*) To be added later, maybe.
>
> On Fri, 10 Mar 2023 at 14:02, Raul Metsma  wrote:
> > There is a built-in variable in CMake: BUILD_SHARED_LIBS
> > (CMake 3.26.0-rc6 documentation, cmake.org)
> >
>
> I was aware of this variable, and I tried using it but the issue is
> that it doesn't grasp the fact that when unset or FALSE I would like
> the default to be shared. So I ended preferring a variable that when
> affirmatively set to TRUE it changes the default behavior, and
> avoiding the use of BUILD_SHARED_LIBS.
>
>


Re: [Podofo-users] Patch for pdfParser - findToken function

2022-05-05 Thread Michal Sudolsky
>
>
> Francesco: Yes PoDoFo crashes when loading my provided file (you can also
> try to open it with podofobrowser, it also crashes with the error that the
> 'trailer' is not found).
>
> Michal: You are correct in assuming that the word 'trailer' in the
> comments would be a problem, I didn't think about that... I also don't know
> off the top of my head how to fix this easily. I haven't looked at pdfmm's
> implementation, but if that doesn't crash (according to Francesco) maybe
> PoDoFo should be changed to that.
>
> "What if xref itself is between trailer and startxref?": if I'm not
> misunderstanding it, this should be prohibited according to pdf spec Version
> 1.7 section 7.5.5 "File trailer" (or my pdf spec is too old :-) )
>
I have a document from 2006. Seems yours is from 2008, so it was probably
added.

But as you can see, my PDFs with these prohibited things open well in every
pdf viewer I tested, including acrobat. There are other "invalid" PDFs
(according to some version of the pdf spec) out there which work fine, so
it is probably good practice for pdf readers to be more lenient.


> Best Regards,
>
> Dennis
>
>
> --
>
> Dennis Voss
> Lead Programmer
>
> dots Gesellschaft für Softwareentwicklung mbH
> Schlesische Str. 27, 10997 Berlin, Germany
>
> Tel: +49 (0)30 695 799-30
>
> dennis.v...@dots.de 
> https://www.dots.de
>
> District court | Amtsgericht: Berlin Charlottenburg HRB 65201
> Managing Director | Geschäftsführer: Katsuji Kondo
>
>


Re: [Podofo-users] Patch for pdfParser - findToken function

2022-04-27 Thread Michal Sudolsky
On Wed, Apr 27, 2022 at 7:56 PM Francesco Pretto  wrote:

> My report on pdfmm:
>
> 512.pdf -> OK
> 513.pdf -> OK
> 514.pdf -> OK
> rev.pdf -> FAIL
> big.pdf -> OK
> false.pdf -> OK
>
> I also created a big2.pdf (attached) that also fails on pdfmm but
> opens on Adobe, where the garbage is put just in between the numeric
> offset and the %%EOF. As you say, a better backward function should be
> created to handle such edge cases.
>

Yes I can see why (maybe you forgot to actually attach it). Also something
like false.pdf with garbage with string "startxref" after that numeric
offset should also fail.


> I think for PoDoFo 0.9.8 we could focus on just handling the specific
> issue reported by Dennis, if possible with few lines of code and not
> breaking other pdfs.
>

Just to note: although it may fix podofo on 512, 513, 514 and big.pdf, it
breaks rev.pdf, which is currently working. Maybe that patch could choose
max(file_size - xref_offset, lRange) and use that as the buffer size as a quick
workaround. Then it will behave the same as before, in that the buffer size
will be equal to or larger than 512. One possible problem with this is
that there is then no functional safeguard any more. With my pdf with 1 GB of
garbage it would need to load all of this garbage into memory. Also, when I
tested this pdf in acrobat it loaded really fast, but it was slower in
other viewers, so maybe they loaded the whole file but acrobat did something
smarter.


> I'm sorry but I'm not available to work on PoDoFo 0.9.x codebase, but
> I will create test cases using the pdfs you created and fix it in
> pdfmm (which is candidate for merging to PoDoFo).
>

512.pdf could also be made with garbage between the numeric offset and EOF, and
then it should trigger that internal logic error in pdfmm too. I wrote
"That will be that "if( !i )" and it will probably throw such an error also
in pdfmm." but for a moment I forgot that pdfmm does not parse the trailer
backwards. I made these files specifically for podofo, but each can be changed to
target startxref instead of trailer.


> Regards,
> Francesco
>
>
> On Wed, 27 Apr 2022 at 18:27, Michal Sudolsky  wrote:
> >
> > Attached are 6 PDF files and all of them open well in 3 pdf viewers I
> > tested.
> >
> >>
> >> so the backward search is correct, but it's better to limit it to
> >> "startxref".
> >>
> >> > Seems you are searching for a trailer right after xref (if I read
> >> > that part well).
> >> >
> >>
> >> Yes, correct, that was a cleaner solution: in my case it was useful to
> >> fix some spurious warnings as the commit message says. It also
> >> improved parsing performance.
> >
> >
> > Btw I noticed some typo here "Ooffset read position to the EOF marker if
> > it is not the last thing in the file".
> >
> >>
> >>
> >> > So is there actually some reason that for "i == 0" it is internal
> >> > logic? What if startxref is precisely PDF_XREF_BUF bytes before the last
> >> > EOF offset (m_LastEOFOffset)?
> >> >
> >>
> >> I didn't modify that code but I believe this was kind of a intended
> >> safeguard since the backward search is slow. Assuming one put a big
> >> amount of garbage also between "startxref" and "%%EOF" yes, what you
> >> say is true.
> >
> >
> > Yes, searching backward may be slow unless the whole file is loaded into
> > memory (which is not really good) but this can be also done by parts see at
> > bottom. And also it can search for both the trailer and startxref at once.
> >
> > 512.pdf gives error:
> >
> > PoDoFo encountered an error. Error: 8 ePdfError_InternalLogic
> > Error Description: An internal error occurred.
> >
> > That will be that "if( !i )" and it will probably throw such an error
> > also in pdfmm. I still do not believe this is really intentional (rather it
> > is just a bug).
> >
> > 513.pdf surprisingly works in podofo (trailer is not found by FindToken
> > but i is -1 so it seeks 513 bytes backwards where is subsequently found
> > trailer by IsNextToken after call to FindToken in ReadTrailer).
> >
> > 514.pdf same error as big.pdf.
> >
> >> We should test if Adobe handles arbitrary amount of
> >> garbage.
> >>
> >
> > big.pdf gives error (it has 1 MB of garbage so it is zipped):
> >
> > PoDoFo encountered an error. Error: 15 ePdfError_NoNumber
> > Error Description: A number was expected but not found.
> >
> > At the bottom of the call stack there is "Information: Unable to find
> > trailer in file."

Re: [Podofo-users] Patch for pdfParser - findToken function

2022-04-26 Thread Michal Sudolsky
You have this here too (just that seems pdfmm searches backwards only for
startxref):

https://github.com/pdfmm/pdfmm/blob/master/src/pdfmm/base/PdfParser.cpp#L931-L932

So is there actually some reason that for "i == 0" it is internal logic?
What if startxref is precisely PDF_XREF_BUF bytes before the last EOF
offset (m_LastEOFOffset)?

Seems you are searching for a trailer right after xref (if I read that part
well).

On Tue, Apr 26, 2022 at 9:50 PM Francesco Pretto  wrote:

> Does PoDoFo crash without the patch? Just to know, because in pdfmm I have
> no issues, but I **heavily** cleaned/bugfixed that code, and for example I
> don't search backward the "trailer" token anymore, which is fishy and I
> believe is the source of your issues. Unfortunately not doing that is not a
> oneliner[1][2].
>
> [1]
> https://github.com/pdfmm/pdfmm/commit/98e8e8c207db8f57bbf1423c1ebd7c78292a
> [2]
> https://github.com/pdfmm/pdfmm/commit/f771490b01a5d86c87891494c9b3adc5ed7e95fb
>
> On Tue, 26 Apr 2022 at 17:42, Dennis Voss  wrote:
>
>> Hey,
>>
>> attached is a pdf-file that is failing to be parsed by PoDoFo and a patch
>> for that bug.
>>
>> Description:
>>
>> The pdf-file has comments between the EOF and the 'trailer' token. These
>> comments are 'longer' than the lRange (lookup range) provided to findToken,
>> so when we try to find the 'trailer' token we will end up somewhere in the
>> comments and fail to find the token.
>>
>> Therefore,  if the token we are looking for is equal to 'trailer' we
>> resize the buffer accordingly (nFileSize - m_nXRefOffset), this should
>> always find the 'trailer' token.
>>
>>
>> I don't know about the findToken2 function. The same code can be copied
>> over to there, but I didn't patch findToken2. That function seems to
>> be a band-aid for some other issue already, so I don't want to mess with
>> it... (feel free to patch it too though, I think it has the same problem).
>>
>>
>> Best Regards,
>>
>> Dennis Voss
>>


Re: [Podofo-users] Patch for pdfParser - findToken function

2022-04-26 Thread Michal Sudolsky
>
>
> if( !i )
> {
> PODOFO_RAISE_ERROR( ePdfError_InternalLogic );
> }
>
> If token keyword is found exactly at position 512 bytes before EOF
> (suppose pdf is larger than 512 bytes) then it will throw
> ePdfError_InternalLogic ("break" will happen when i == 0). There probably
> should have been if( i < 0 ) because if token is not found then i will be
> negative (-1) and not zero (this is also indicated in that comment where is
> i defined) and in this case that error probably should have been something
> like "invalid pdf" not "internal logic error". I would add new issue on
> github but is this resolved in pdfmm?
>
>
Actually, that "if" should probably be removed altogether (if I did not miss
something), because the absence of the token keyword is generally checked
after a call to FindToken.



> Best Regards,
>>
>> Dennis Voss


Re: [Podofo-users] Patch for pdfParser - findToken function

2022-04-26 Thread Michal Sudolsky
On Tue, Apr 26, 2022 at 5:42 PM Dennis Voss  wrote:

> Hey,
>
> attached is a pdf-file that is failing to be parsed by PoDoFo and a patch
> for that bug.
>
> Description:
>
> The pdf-file has comments between the EOF and the 'trailer' token. These
> comments are 'longer' than the lRange (lookup range) provided to findToken,
> so when we try to find the 'trailer' token we will end up somewhere in the
> comments and fail to find the token.
>

It seems that if the word "trailer" appeared in these comments, podofo would
confuse it with the actual trailer.


> Therefore,  if the token we are looking for is equal to 'trailer' we
> resize the buffer accordingly (nFileSize - m_nXRefOffset), this should
> always find the 'trailer' token.
>
>
What if the xref itself is between trailer and startxref?



> I don't know about the findToken2 function. The same code can be copied
> over to there, but I didn't patch findToken2. That function seems to be
> a band-aid for some other issue already, so I don't want to mess with it...
> (feel free to patch it too though, I think it has the same problem).
>
>
>
Also not really related to attached patch but this looks really weird:

int i; // Do not make this unsigned, this will cause infinte loops in files without trailer

for( i = lXRefBuf - nTokenLen; i >= 0; i-- )
{
    if( strncmp( m_buffer.GetBuffer()+i, pszToken, nTokenLen ) == 0 )
    {
        break;
    }
}

if( !i )
{
    PODOFO_RAISE_ERROR( ePdfError_InternalLogic );
}

If the token keyword is found exactly 512 bytes before EOF (suppose the
pdf is larger than 512 bytes), then it will throw ePdfError_InternalLogic
(the "break" will happen when i == 0). It probably should have been if( i <
0 ), because if the token is not found then i will be negative (-1) and not zero
(this is also indicated in the comment where i is defined), and in that
case the error probably should have been something like "invalid pdf", not
"internal logic error". I would open a new issue on github, but is this
resolved in pdfmm?
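
The difference between "found at index 0" and "not found" can be shown with a
small standalone version of the same loop. This is a sketch, not the PoDoFo
function: FindTokenBackward is a hypothetical name, and it takes a std::string
instead of the parser's internal buffer.

```cpp
#include <cstring>
#include <string>

// Searches "buffer" from the end toward the beginning.
// Returns the index of the last occurrence of "token",
// or a negative value if the token does not occur.
int FindTokenBackward(const std::string& buffer, const char* token)
{
    const int tokenLen = static_cast<int>(std::strlen(token));
    int i; // signed on purpose: the loop ends with i == -1 when nothing matches
    for (i = static_cast<int>(buffer.size()) - tokenLen; i >= 0; i--)
    {
        if (std::memcmp(buffer.data() + i, token, tokenLen) == 0)
            break; // may legitimately happen with i == 0
    }
    return i;
}
```

A caller that tests "!i" rejects a buffer that starts with the token, while a
caller that tests "i < 0" rejects only the real failure case.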

> Best Regards,
>
> Dennis Voss


Re: [Podofo-users] Test EncodingTest::testDifferencesEncoding() now properly fixed

2022-04-26 Thread Michal Sudolsky
>
> I tried the code you supplied in pdfmm: if the found font has all the
> required GIDs, and the standard14 Helvetica actually doesn't have all of
> them so I used Arial as a fallback, I can already handle the text
> correctly, see the attachment.


As I remember, you did some unification regarding PdfString and UTF-8 in
pdfmm, so if the source file is interpreted as UTF-8 I can see
why all texts in your pdf are correct. And the standard14 Helvetica actually
has all the required glyphs; you can see this in the pdf posted by zyx, where
the third text is displayed correctly. You just need an encoding which
contains them all, like Win1250 (I hope that standard14 fonts are not broken
in pdfmm).

> str = PdfString("ěščřABCĚŠČŘ");


As your string does not have a unicode marker, it will just be copied into
the internal buffer as is. As your file is interpreted as UTF-8, str now
contains a UTF-8 encoded string but treats its bytes as encoded in
PDFDocEncoding.

> ustr = str.GetStringUtf8();
>
> printf ("1) '%s'\n", ustr.c_str());
>

And in ustr you get UTF-8 garbage.


> painter.DrawText(10, 780, str);
>

And it will not be correct in pdf.

> painter.DrawText(10, 740, ustr);
>
>
Now this takes the UTF-8 garbage, treats it as PDFDocEncoding, and you get
more garbage.



> str = PdfString((const pdf_utf8 *) "ěščřABCĚŠČŘ");
>

As this 2) is working I would suppose your file is interpreted as UTF-8
encoded. String str now contains correct UTF-16BE encoded string if I am
not wrong.

> ustr = str.GetStringUtf8();
> printf ("2) '%s'\n", ustr.c_str());
>

Here printf in your environment treats ustr as UTF-8 string and prints it
correctly.

> painter.DrawText(10, 700, str);
>

Now it is not surprising that this is working.



> painter.DrawText(10, 660, ustr);
>
>
But this is not because you again used the PdfString constructor as in "str
= PdfString("ěščřABCĚŠČŘ");".


> printf ("%s: wrote %d bytes: '%.*s'\n", __FUNCTION__, (int)
> output.GetLength(), (int) output.GetLength(), output.Get());
>

I suppose the output contains the same hex sequences as are in your pdf
file?


> Why do I include it here when it does not touch the r1967 change? I
> think the change in the r1967 can be correct, the problem is in the
> litePDF, not using proper PdfString constructors, similarly to the
> above test program. It can be the litePDF "counted" (even
> unintentionally) with the previous behavior, without using correct
> functions for the PdfString; or, taken it the other way around, the way
> litePDF has it done was the right way to do it before the r1967 change.
>

I now really cannot see how this all relates to r1967. The only right way
regardless of r1967 is to always pass string in correct encoding into
PdfString constructors.
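
To make the "UTF-8 garbage" concrete, here is a small sketch of what happens
when UTF-8 bytes are treated as a single-byte encoding and then converted to
UTF-8 again. It is an illustration only: SingleByteToUtf8 is a hypothetical
helper, and it uses Latin-1 as a stand-in for PDFDocEncoding, which differs
from Latin-1 in some positions. It does not call any PoDoFo code.

```cpp
#include <string>

// Treats every input byte as one Latin-1 character and re-encodes it as UTF-8.
// Latin-1 is an assumption standing in for PDFDocEncoding here.
std::string SingleByteToUtf8(const std::string& bytes)
{
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            out += static_cast<char>(c); // ASCII is unchanged
        } else {
            // Code points 0x80..0xFF need two bytes in UTF-8.
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

The letter "ě" is the two bytes C4 9B in UTF-8. Fed through this function each
byte becomes a character of its own, so the result is the four bytes
C3 84 C2 9B, which no longer decode to "ě". Plain ASCII such as "ABC" survives
unchanged, which matches the pattern of only the accented letters being lost.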


> I mean, I consider this solved. I'll find a way to properly adapt the
> litePDF code to work as expected with the fixed PoDoFo. Maybe the above
> will help someone else when dealing with the lost UTF-8/Unicode
> letters.
>
> Bye,
> zyx


Re: [Podofo-users] Could someone tell me the encode type of HexString followed by Tj

2022-04-12 Thread Michal Sudolsky
On Tue, Apr 12, 2022 at 5:24 PM Francesco Pretto  wrote:

> On Tue, 12 Apr 2022 at 14:50, zyx  wrote:
> > there exists a text extract tool [1], which is supposed to, well, extract
> > text from the PDF files.
> > [1]
> https://sourceforge.net/p/podofo/code/HEAD/tree/podofo/branches/PODOFO_0_9_7_BRANCH/tools/podofotxtextract/
> >
>
> Correct: albeit many text related operators are not handled, that is
> the code to look in PoDoFo.
>
>
Just note that the text position really does not depend on the "m" or "l" operators,
as that code may misleadingly suggest (correct me if I am wrong):

if( strcmp( pszToken, "l" ) == 0 ||
    strcmp( pszToken, "m" ) == 0 )
{
    if( stack.size() == 2 )
    {
        dCurPosX = stack.top().GetReal();
        stack.pop();
        dCurPosY = stack.top().GetReal();
        stack.pop();
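
For comparison, "m" and "l" are path construction operators (moveto and
lineto), while the text position is driven by text positioning operators such
as Td and Tm. The following is a minimal sketch of tracking it, assuming only
BT, Td and Tm are handled; the TextState struct and its member names are
hypothetical and are not part of podofotxtextract.

```cpp
#include <array>

// Tracks the text line matrix [a b c d e f]; (e, f) is the text position.
struct TextState {
    std::array<double, 6> tlm{1.0, 0.0, 0.0, 1.0, 0.0, 0.0};

    // BT: start of a text object resets the matrix to identity.
    void BeginText() { tlm = {1.0, 0.0, 0.0, 1.0, 0.0, 0.0}; }

    // "tx ty Td": move to the start of the next line, offset by (tx, ty)
    // in the coordinate system of the current text line matrix.
    void Td(double tx, double ty)
    {
        tlm[4] += tx * tlm[0] + ty * tlm[2];
        tlm[5] += tx * tlm[1] + ty * tlm[3];
    }

    // "a b c d e f Tm": replace the matrix outright.
    void Tm(double a, double b, double c, double d, double e, double f)
    {
        tlm = {a, b, c, d, e, f};
    }

    double X() const { return tlm[4]; }
    double Y() const { return tlm[5]; }
};
```

Operators such as TD, T*, ' and " also move the text position and would need
the same treatment in a real extractor.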


> Cheers,
> Francesco
>
>


Re: [Podofo-users] UB in Big2Little

2022-03-21 Thread Michal Sudolsky
On Mon, Mar 21, 2022 at 9:22 AM zyx  wrote:

> On Fri, 2022-03-18 at 19:02 +0100, Michal Sudolsky wrote:
> > It will probably work only on your machine or only on the OS you
> > are using. Or it can stop working in some newer version of that OS
> > (if the font you used changes in some specific way).
>
> Hi,
> yes, it's working only as a trigger of the function, it doesn't test
> the output of it. As the function itself is defined only for little
> endian, the test "works" for little endian systems only. The test fails
> with Unsupported Font exception, but it was enough to trigger the stack
> overflow after incorrect application of the proposed change.
>
> That's what I meant with "not a perfect test" earlier.
>

At first I meant that it could fail to catch the problem if there are no negative numbers in
the font (I am not sure why I thought that it would cause a stack overflow
only for negative inputs). But, for example, Arial may not be a TTF (let's
pretend that podofo supports other font types for subsetting). I mean, there
should be a specific font file alongside this test.

> Bye,
> zyx
>
>


Re: [Podofo-users] UB in Big2Little

2022-03-18 Thread Michal Sudolsky
On Sat, Mar 12, 2022 at 6:49 PM zyx  wrote:

> On Sat, 2022-03-12 at 17:45 +0100, zyx wrote:
> > I committed the patch as r2051:
> > http://sourceforge.net/p/podofo/code/2051
>
> ...and reverted in r2052:
> http://sourceforge.net/p/podofo/code/2052
>
> because the change causes stack overflow, by calling itself.
> Unfortunately, no unit test triggers this part of the code.
> I added a simple unit test in r2053:
> http://sourceforge.net/p/podofo/code/2053
>
> It's not a perfect test, but it at least triggers the Big2Little()
> on little endian arches.
>

It will probably work only on your machine or only on the OS you are using.
Or it can stop working in some newer version of that OS (if the
font you used changes in some specific way).


> Bye,
> zyx
>
>


Re: [Podofo-users] UB in Big2Little

2022-03-14 Thread Michal Sudolsky
On Mon, Mar 14, 2022 at 8:01 AM zyx  wrote:

> On Sat, 2022-03-12 at 19:15 +0100, Michal Sudolsky wrote:
> > How did you apply that patch?
>
> Hi,
> I applied it as always, `patch -p1 
> > There are two places where this change can happen...
>
> Aha, I see, the `patch` applied too early:
>
>patching file doc/PdfFontTTFSubset.cpp
>Hunk #1 succeeded at 65 with fuzz 2 (offset -5 lines).
>
> I did not pay any attention to the offset, because it's quite common
> that long standing patches do not apply cleanly. I also did not expect
> this could apply on two places.
>

Maybe another reason to dislike duplications ;)


> Though the current sources have:
>
>inline short Big2Little(short big)
>{
>return Big2Little(static_cast<unsigned short>(big));
>}
>
> which is different from the patch (and which explains why the `patch`
> applied it with the offset).
>
> Is the undefined behavior fixed with the current trunk?
>

Technically, before the first patch there was UB (until C++20), but probably
all major compilers do the right thing (this may even be documented, or
maybe not). The first patch (current trunk) may not communicate the right
intention (until C++20, although major compilers may document that it is right).
The second patch tries to solve that but may have its own problems (in
any C++ version), and a union may be better (not guaranteed by C++, but better
guaranteed by major compilers). Probably the best is to leave it as it is now.
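
For reference, one variant that avoids both the signed shift and the
reinterpret_cast is to do the swap on the unsigned type and move the bits
with memcpy. This is only a sketch and not a proposed patch: SwapBytes16 is a
hypothetical name, and it uses the fixed-width types from <cstdint> instead of
short.

```cpp
#include <cstdint>
#include <cstring>

// Byte swap on the unsigned type: the operand is promoted to a
// non-negative int, so the shifts are well defined.
inline std::uint16_t SwapBytes16(std::uint16_t v)
{
    return static_cast<std::uint16_t>(((v << 8) & 0xFF00u) | ((v >> 8) & 0x00FFu));
}

// Signed overload: copy the bits into an unsigned value, swap, copy back.
// The argument passed on is a std::uint16_t, so overload resolution picks
// the unsigned version and the function cannot call itself.
inline std::int16_t SwapBytes16(std::int16_t v)
{
    std::uint16_t u;
    std::memcpy(&u, &v, sizeof u);
    u = SwapBytes16(u);
    std::int16_t r;
    std::memcpy(&r, &u, sizeof r);
    return r;
}
```

For example, -2 has the bit pattern 0xFFFE; swapped it becomes 0xFEFF, which
is -257 as a 16-bit signed value.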


> Bye,
> zyx
>
>


Re: [Podofo-users] UB in Big2Little

2022-03-12 Thread Michal Sudolsky
OK, the problem is probably that it should be applied without the change in
https://sourceforge.net/p/podofo/code/2047, but never mind, that patch
is unnecessary anyway (it just silences my UB sanitizer).

On Sat, Mar 12, 2022 at 7:15 PM Michal Sudolsky  wrote:

>
> On Sat, Mar 12, 2022 at 6:49 PM zyx  wrote:
>
>> On Sat, 2022-03-12 at 17:45 +0100, zyx wrote:
>> > I committed the patch as r2051:
>> > http://sourceforge.net/p/podofo/code/2051
>>
>> ...and reverted in r2052:
>> http://sourceforge.net/p/podofo/code/2052
>>
>> because the change causes stack overflow, by calling itself.
>> Unfortunately, no unit test triggers this part of the code.
>> I added a simple unit test in r2053:
>> http://sourceforge.net/p/podofo/code/2053
>
>
> How did you apply that patch? It should change the line:
>
> -return ((big << 8) & 0xFF00) | ((big >> 8) & 0x00FF);
>
> into two lines:
>
> +unsigned short little = Big2Little(*reinterpret_cast<unsigned short*>(&big));
> +return *reinterpret_cast<short*>(&little);
>
> There are two places where this change can happen, but the patch clearly
> states that it should happen in the function with signature "inline short
> Big2Little(short big)", not in the function "inline unsigned short
> Big2Little(unsigned short big)" as in r2051.
>
>
>
>>
>>
>> It's not a perfect test, but it at least triggers the Big2Little()
>> on little endian arches.
>>
>> Bye,
>> zyx
>>
>>


Re: [Podofo-users] UB in Big2Little

2022-03-12 Thread Michal Sudolsky
On Sat, Mar 12, 2022 at 6:49 PM zyx  wrote:

> On Sat, 2022-03-12 at 17:45 +0100, zyx wrote:
> > I committed the patch as r2051:
> > http://sourceforge.net/p/podofo/code/2051
>
> ...and reverted in r2052:
> http://sourceforge.net/p/podofo/code/2052
>
> because the change causes stack overflow, by calling itself.
> Unfortunately, no unit test triggers this part of the code.
> I added a simple unit test in r2053:
> http://sourceforge.net/p/podofo/code/2053


How did you apply that patch? It should change the line:

-return ((big << 8) & 0xFF00) | ((big >> 8) & 0x00FF);

into two lines:

+unsigned short little = Big2Little(*reinterpret_cast<unsigned short*>(&big));
+return *reinterpret_cast<short*>(&little);

There are two places where this change can happen, but the patch clearly
states that it should happen in the function with signature "inline short
Big2Little(short big)", not in the function "inline unsigned short
Big2Little(unsigned short big)" as in r2051.



>
>
> It's not a perfect test, but it at least triggers the Big2Little()
> on little endian arches.
>
> Bye,
> zyx
>
>


Re: [Podofo-users] UB in Big2Little

2022-02-05 Thread Michal Sudolsky
>
>
> One's complement architectures are essentially dead.


They still live in emulators.

> Even the C standard is going to acknowledge that in one of the next
> iterations.
>
>
Actually from C++20 (podofo is in C++) signed integers are two's complement
and also that original patch is unnecessary because left shift of
negative value is defined (but it at least removes code duplication).

> Joerg
>
>


Re: [Podofo-users] UB in Big2Little

2022-02-04 Thread Michal Sudolsky
>
>
> Hi,
> thanks for the patch. I applied it as [r2047].
>

Actually that patch is not correct. Here we want to reinterpret bits, but
static_cast here will perform modulo 2^N arithmetic (which is
coincidentally the "same" thing on two's complement architectures).

But for example on a one's complement architecture, the value -1 in 16-bit big
endian is stored as "1111 1110 1111 1111", which is interpreted as -256, which
when converted to unsigned short gives 65280, in binary "1111 1111 0000
0000"; after swapping bytes this is "0000 0000 1111 1111", which is 255 as
either signed or unsigned short. So -1 would be read as 255 (if I did not make
some mistake).

Actually podofo would not work on one's complement architecture even if
this part was done correctly because I suppose font file format has defined
that negative numbers are stored as two's complement so it would need to
perform additional conversions (in other case value -1 would be read as
"-0").
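
One way to sidestep the host's integer representation entirely is to assemble
the value from the individual bytes and apply the two's complement rule by
hand. This is only a minimal sketch, assuming the font data is available as raw
bytes; `ReadInt16BE` is a hypothetical helper, not a PoDoFo function.

```cpp
// Hypothetical helper (not part of PoDoFo): decode a big-endian,
// two's complement 16-bit value from two raw bytes.
inline long ReadInt16BE(const unsigned char* p)
{
    // Assemble the 16 bits in unsigned arithmetic, so no negative value is shifted.
    const unsigned int raw = (static_cast<unsigned int>(p[0]) << 8) | p[1];

    // Two's complement rule: patterns with the top bit set represent raw - 2^16.
    // long is used because it is guaranteed to hold 0x10000.
    return raw >= 0x8000u ? static_cast<long>(raw) - 0x10000L
                          : static_cast<long>(raw);
}
```

Because the sign is applied arithmetically, the result should not depend on
how the host stores negative numbers or on its byte order.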

Proposed quick patch using reinterpret_cast attached.

> Bye,
> zyx
>
> [r2047] http://sourceforge.net/p/podofo/code/2047
>
>


patch.diff
Description: Binary data


Re: [Podofo-users] PoDoFo and recursive stack consumption CVEs

2022-02-02 Thread Michal Sudolsky
+ if ( s_nRecursionDepth > s_maxRecursionDepth )
+ {
+ // avoid stack overflow on documents that have circular cross references, loops
+ // or very deeply nested structures, can happen with
+ // /Prev entries in trailer and XRef streams (possible via a chain of entries with a loop)
+ // /Kids entries that loop back to self or parent
+ // deeply nested Dictionary or Array objects (possible with lots of brackets)
+ // mutually recursive loops involving several objects are possible
+ PODOFO_RAISE_ERROR( ePdfError_InvalidXRef );
+ }

Not all these cases are invalid xref errors.

On Wed, Feb 2, 2022 at 7:16 PM Mark Rogers 
wrote:

> Hi Everyone
>
>
>
> Here are patches for recursive stack consumption, which should fix
> CVE-2018-8002, CVE-2021-30470, CVE-2021-30471,  CVE-2020-18971
>
>
>
> This works by refactoring the recursion guard and making it a nested class
> of PdfTokenizer (as it’s mostly used by the tokenizer and parser). As
> agreed earlier in this thread the patch means that PoDoFo requires C++ 11
> if compiled with PODOFO_MULTI_THREAD
>
>
>
> The patch has been tested against the CVE PoC files, and the new unit
> tests. It’s also been tested in production for 2 months on macOS (64-bit)
> and Windows (32-bit)
>
>
>
> We haven’t tested on Linux. This might be relevant for the
> ParserTest::getStackOverflowDepth() unit test method which calculates an
> overflow depth for each platform that causes stack overflow without
> exhausting the heap (although the calculation should be the same as macOS
> since they both use the same System V AMD64 ABI).
>
>
>
> Best Regards
>
> Mark
>
>
>
> Mark Rogers - mark.rog...@powermapper.com
>
> PowerMapper Software Ltd - www.powermapper.com
>
> Registered in Scotland No 362274 Quartermile 2 Edinburgh EH3 9GL
>
>
>
>
>
>
>
>
>
>
>
> *From: *Michal Sudolsky 
> *Date: *Thursday, 25 November 2021 at 18:25
> *To: *Christopher Creutzig 
> *Cc: *"podofo-users@lists.sourceforge.net" <
> podofo-users@lists.sourceforge.net>
> *Subject: *Re: [Podofo-users] PoDoFo and recursive stack consumption CVEs
>
>
>
>
>
>
>
> On Thu, Nov 25, 2021 at 4:58 PM Christopher Creutzig <
> ccreu...@mathworks.com> wrote:
>
> > Ok, I can submit a patch which uses C++11 thread_local when 
> > PODOFO_MULTI_THREAD
> is defined. The recursion guard definition will look like this:
>
> >
>
> > #if defined(PODOFO_MULTI_THREAD)
>
> >  static int thread_local s_nRecursionDepth; // PoDoFo with threading
> support requires C++11 compiler with thread_local
>
> > #else
>
> >  static int  s_nRecursionDepth;  // PoDoFo is single threaded
>
> > #endif
>
> >
>
> > Does that work for everyone?
>
>
>
> Looks good to me, and the comment is hopefully explanation enough if
> anyone runs into a compile time error. Please do include a doc patch
> stating the requirement.
>
>
>
> Can we get a macro that creates this thread-local integer and the
> recursion guard object all in one go, with the connotation that the
> recursion guard is meant to usually be applied to each affected function
> entry point separately? (Unless that is not what it is meant to do. I think
> we could just as well make an argument for a single recursion depth counter
> per thread, which then probably should become a static member of the
> recursion guard class.)
>
>
>
> I think that it was meant that there will be just a single recursion
> counter per thread.
>
>
>
>
>
>
>
> Cheers,
>
> Christopher
>
>
>
> The MathWorks GmbH | Friedlandstr.18 | 52064 Aachen | District Court
> Aachen | HRB 8082 | Managing Directors: Bertrand Dissler, Steven D. Barbo,
> Jeanne O’Keefe
>
>
>
>
>
>
>
> *From:* Mark Rogers 
> *Sent:* Thursday, November 25, 2021 16:33
> *To:* Christopher Creutzig ; Michal Sudolsky <
> sudols...@gmail.com>
> *Cc:* podofo-users@lists.sourceforge.net
> *Subject:* Re: [Podofo-users] PoDoFo and recursive stack consumption CVEs
>
>
>
> >>> I like this idea. As PODOFO_MULTI_THREAD will enable C++11
>
> >>Is there a chance we might get there? Who would be able to make that
> decision?
>
> >consider the decision being made. Again, from my point of view. In
>
> >other words, feel free to provide a patch with the suggested changes.
>
>
>
> Ok, I can submit a patch which uses C++11 thread_local when 
> PODOFO_MULTI_THREAD
> is defined. The recursion guard definition will look like this:
>
>
>
> #if defined(PODOFO_MULTI_THREAD)
>
>   static int thread_local s_nRecursionDepth; // 

Re: [Podofo-users] Can I create a fork?

2022-01-26 Thread Michal Sudolsky
On Wed, Jan 26, 2022 at 3:59 PM  wrote:

> Hi, can I create a fork of libpodofo? I would like make a contribution.
>

It is on SVN, so there are no forks. You would need to send a patch as a file.
There is also one fork on github by ceztko (but both projects are somewhat
divergent): https://github.com/pdfmm/pdfmm

> Best regards,
>


Re: [Podofo-users] Test EncodingTest::testDifferencesEncoding() now properly fixed

2022-01-23 Thread Michal Sudolsky
On Sun, Jan 23, 2022 at 2:04 PM zyx  wrote:

> On 21.2.2019 v 20:18 Michal Sudolsky wrote:
> > I sent more which are still pending. At least let me know if some are
> > not acceptable so I can remove it from pending things ;)
>
> Hi,
> I just realized that the change in the [r1967] breaks font subsetting
> with replacement to Base14 fonts in litePDF using PoDoFo 0.9.7.


Is it working with the latest revision?


> Writing "ěščř" only the "š" survives. There's used
> PdfString::setFromWchar_t()
> followed by m_pFont->WriteStringToStream(...) and the
> PdfEncodingDifference::ContainsUnicodeValue() fails to find some
> letters. The base encoding is WinAnsi.
>
>
Can you elaborate what exactly you are doing? If you are calling
ContainsUnicodeValue directly then you need to provide a letter in big
endian as function signature indicates. setFromWchar_t is swapping bytes so
internally they should be big endian inside resulting PdfString and
WriteStringToStream also accepts big endian data from PdfString (and I have
well tested this part with the latest revision).

> This is only a little heads-up. I do not know how to correctly address
> this yet (it feels natural to store data in the PdfString in the machine
> byte order, thus the string can be used without additional
> re-byte-swapping needed). Suggestions welcome.
>

If I am not wrong, PdfString is internally always stored as a big endian.
There is no flag that would indicate whether it is big or machine byte
order so you cannot expect that it will work if you store it in PdfString
in machine byte order.
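
To illustrate what "big endian inside PdfString" means for a caller, here is a
minimal sketch. It assumes a plain 16-bit code unit; `ToBigEndian16` is a
hypothetical helper, not PoDoFo API. It returns a value whose bytes in memory
are in big-endian order regardless of the host byte order.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of PoDoFo): lay out a 16-bit code unit
// so that its in-memory bytes are big endian on any host.
inline std::uint16_t ToBigEndian16(std::uint16_t value)
{
    const unsigned char bytes[2] = {
        static_cast<unsigned char>(value >> 8),   // high byte first
        static_cast<unsigned char>(value & 0xFF)  // low byte second
    };
    std::uint16_t out;
    std::memcpy(&out, bytes, sizeof out); // copy bytes, no aliasing tricks
    return out;
}
```

On a big-endian host this returns the value unchanged; on a little-endian host
the two bytes end up swapped, which is the form a function documented as taking
big-endian input would expect.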

> Bye,
> zyx
>
> [r1967] https://sourceforge.net/p/podofo/code/1967/
>
>


Re: [Podofo-users] UB in Big2Little

2021-11-26 Thread Michal Sudolsky
Proposed quick patch attached.

On Thu, Nov 25, 2021 at 9:44 PM Michal Sudolsky  wrote:

> This function sometimes takes negative values, for example when
> contourCount is -1 (0xFFFF), which is UB: "For negative a, the behavior of a
> << b is undefined." Possible output from the sanitizer:
>
> podofo/doc/PdfFontTTFSubset.cpp:73:18: runtime error: left shift of
> negative value -1
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
> podofo/doc/PdfFontTTFSubset.cpp:73:18 in
>
> inline short Big2Little(short big)
> {
> return ((big << 8) & 0xFF00) | ((big >> 8) & 0x00FF);
> }
>
> Called from here:
>
> GetData( ctx.ulGlyfTableOffset + ctx.glyphData.glyphAddress,
> &ctx.contourCount, __LENGTH_WORD);
> ctx.contourCount = Big2Little(ctx.contourCount);
> if (ctx.contourCount < 0) {
> /* skeep over numberOfContours, xMin, yMin, xMax and yMax
> */
> LoadCompound(ctx, ctx.glyphData.glyphAddress + 5 *
> __LENGTH_WORD);
>
>


patch.diff
Description: Binary data


[Podofo-users] UB in Big2Little

2021-11-25 Thread Michal Sudolsky
This function sometimes takes negative values, for example when contourCount
is -1 (0xFFFF), which is UB: "For negative a, the behavior of a << b is
undefined." Possible output from the sanitizer:

podofo/doc/PdfFontTTFSubset.cpp:73:18: runtime error: left shift of
negative value -1
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
podofo/doc/PdfFontTTFSubset.cpp:73:18 in

inline short Big2Little(short big)
{
return ((big << 8) & 0xFF00) | ((big >> 8) & 0x00FF);
}

Called from here:

GetData( ctx.ulGlyfTableOffset + ctx.glyphData.glyphAddress,
&ctx.contourCount, __LENGTH_WORD);
ctx.contourCount = Big2Little(ctx.contourCount);
if (ctx.contourCount < 0) {
/* skeep over numberOfContours, xMin, yMin, xMax and yMax */
LoadCompound(ctx, ctx.glyphData.glyphAddress + 5 *
__LENGTH_WORD);
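
A possible way to avoid shifting a negative value is to do the swap on the
unsigned bit pattern and copy the bits back afterwards. The following is only
a sketch under assumptions: `SwapBytes16` and `Big2LittleSigned` are
hypothetical names, and this is not the patch attached later in the thread.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: swap the two bytes of an unsigned 16-bit value.
inline std::uint16_t SwapBytes16(std::uint16_t value)
{
    // value is promoted to int, which is non-negative here, so the shifts are defined.
    return static_cast<std::uint16_t>((value << 8) | (value >> 8));
}

// Hypothetical signed variant: reinterpret the bits, swap, reinterpret back.
inline std::int16_t Big2LittleSigned(std::int16_t big)
{
    std::uint16_t bits;
    std::memcpy(&bits, &big, sizeof bits);   // copy the bit pattern, not the value
    bits = SwapBytes16(bits);
    std::int16_t little;
    std::memcpy(&little, &bits, sizeof little);
    return little;
}
```

Using memcpy keeps the bit pattern intact without relying on pointer casts or
on how a negative value converts to an unsigned type.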


Re: [Podofo-users] PoDoFo and recursive stack consumption CVEs

2021-11-25 Thread Michal Sudolsky
On Thu, Nov 25, 2021 at 4:58 PM Christopher Creutzig 
wrote:

> > Ok, I can submit a patch which uses C++11 thread_local when 
> > PODOFO_MULTI_THREAD
> is defined. The recursion guard definition will look like this:
>
> >
>
> > #if defined(PODOFO_MULTI_THREAD)
>
> >  static int thread_local s_nRecursionDepth; // PoDoFo with threading
> support requires C++11 compiler with thread_local
>
> > #else
>
> >  static int  s_nRecursionDepth;  // PoDoFo is single threaded
>
> > #endif
>
> >
>
> > Does that work for everyone?
>
>
>
> Looks good to me, and the comment is hopefully explanation enough if
> anyone runs into a compile time error. Please do include a doc patch
> stating the requirement.
>
>
>
> Can we get a macro that creates this thread-local integer and the
> recursion guard object all in one go, with the connotation that the
> recursion guard is meant to usually be applied to each affected function
> entry point separately? (Unless that is not what it is meant to do. I think
> we could just as well make an argument for a single recursion depth counter
> per thread, which then probably should become a static member of the
> recursion guard class.)
>
>
I think that it was meant that there will be just a single recursion
counter per thread.
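
For illustration, a single per-thread counter held as a static member of the
guard class could look roughly like this. It is only a sketch under
assumptions: the names `RecursionGuardSketch` and `Nest` and the limit of 256
are hypothetical, and this is not the patch discussed in this thread.

```cpp
#include <stdexcept>

// Hypothetical RAII guard with one recursion counter per thread (C++11).
class RecursionGuardSketch
{
public:
    RecursionGuardSketch()
    {
        if (++s_depth > s_maxDepth)
        {
            --s_depth; // the destructor does not run when the constructor throws
            throw std::runtime_error("recursion depth exceeded");
        }
    }
    ~RecursionGuardSketch() { --s_depth; }

    RecursionGuardSketch(const RecursionGuardSketch&) = delete;
    RecursionGuardSketch& operator=(const RecursionGuardSketch&) = delete;

    static int Depth() { return s_depth; }

private:
    static constexpr int s_maxDepth = 256; // illustrative limit only
    static thread_local int s_depth;       // each thread gets its own counter
};

thread_local int RecursionGuardSketch::s_depth = 0;

// Example recursive function: every entry point creates one guard.
inline int Nest(int levels)
{
    RecursionGuardSketch guard;
    return levels == 0 ? RecursionGuardSketch::Depth() : Nest(levels - 1);
}
```

Because the counter is thread_local, parsing several documents in parallel
should not make one thread's nesting count against another's.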


>
>
>
> Cheers,
>
> Christopher
>
>
>
> The MathWorks GmbH | Friedlandstr.18 | 52064 Aachen | District Court
> Aachen | HRB 8082 | Managing Directors: Bertrand Dissler, Steven D. Barbo,
> Jeanne O’Keefe
>
>
>
>
>
>
>
> *From:* Mark Rogers 
> *Sent:* Thursday, November 25, 2021 16:33
> *To:* Christopher Creutzig ; Michal Sudolsky <
> sudols...@gmail.com>
> *Cc:* podofo-users@lists.sourceforge.net
> *Subject:* Re: [Podofo-users] PoDoFo and recursive stack consumption CVEs
>
>
>
> >>> I like this idea. As PODOFO_MULTI_THREAD will enable C++11
>
> >>Is there a chance we might get there? Who would be able to make that
> decision?
>
> >consider the decision being made. Again, from my point of view. In
>
> >other words, feel free to provide a patch with the suggested changes.
>
>
>
> Ok, I can submit a patch which uses C++11 thread_local when 
> PODOFO_MULTI_THREAD
> is defined. The recursion guard definition will look like this:
>
>
>
> #if defined(PODOFO_MULTI_THREAD)
>
>   static int thread_local s_nRecursionDepth; // PoDoFo with threading
> support requires C++11 compiler with thread_local
>
> #else
>
>   static int  s_nRecursionDepth;  // PoDoFo is single threaded
>
> #endif
>
>
>
> Does that work for everyone?
>
>
>
> > Not when the user of podofo already used some 64 KB before calling
> podofo. To me it seems more reasonable to use a more conservative value
> which would not consume more than some half (or tenth?) of the available
> stack in the worst case.
>
>
>
> I’ll also reduce the 500 max recursion depth as suggested (probably to 256)
>
>
>
> And I’ll also include the new parser unit tests which test for deep
> recursion and reference loops
>
>
>
> We’re also testing a patch for CVE-2018-20797. This is caused by an
> invalid negative value for one of the FlateDecode compression parameters
> which results in a call to podofo_calloc( -14 ) == podofo_calloc(
> 0xfff2 )
>
>
>
> Best Regards
>
> Mark
>
>
>
> Mark Rogers - mark.rog...@powermapper.com
>
> PowerMapper Software Ltd - www.powermapper.com
>
> Registered in Scotland No 362274 Quartermile 2 Edinburgh EH3 9GL
>
>
>
>
>
>
>
> *From: *Christopher Creutzig 
> *Date: *Thursday, 25 November 2021 at 07:16
> *To: *Michal Sudolsky 
> *Cc: *"PowerMapper.com" , "
> podofo-users@lists.sourceforge.net" 
> *Subject: *RE: [Podofo-users] PoDoFo and recursive stack consumption CVEs
>
>
>
> >> If we want to avoid UB in the multithreaded world, I’m afraid we will
> have to make a C++11 compiler a requirement, as C++03 never acknowledged
> the existence of threads. (That is not limited to this place, a lot of
> methods like PdfEncodingFactory::GlobalPdfRomanEncodingInstance are not
> currently threadsafe in C++03, as discussed earlier.)
>
> > That is not thread-safe even in C++11.
>
>
>
> True, but C++11 or later would give us the tools to make it thread-safe.
>
>
>
> > Except that some things are not so available as threads like for
> example thread_local and atomic operations.
>
>
>
> thread_local equivalents are available for g++, clang, and MSVC. That
> covers the compilers listed in
> https://github.com/Nick

Re: [Podofo-users] PoDoFo and recursive stack consumption CVEs

2021-11-24 Thread Michal Sudolsky
> If we want to avoid UB in the multithreaded world, I’m afraid we will have
> to make a C++11 compiler a requirement, as C++03 never acknowledged the
> existence of threads. (That is not limited to this place, a lot of methods
> like PdfEncodingFactory::GlobalPdfRomanEncodingInstance are not currently
> threadsafe in C++03, as discussed earlier.)
>
>
That is not thread-safe even in C++11.

> There's no need for extra support
> on the C++ level, when there are existing tools for it already.


Except that some things are not as widely available as threads, for example
thread_local and atomic operations. The DCLP (double-checked locking) in
GlobalWinAnsiEncodingInstance and similar functions would need load-acquire
(or "consume" may be enough) and store-release. Without C++11 this would need
to be compiler specific; for example GCC has "__atomic" builtins and also
something for thread_local. So why were these things not fixed in podofo
already a long time ago? Maybe because it is somehow simpler to just use a
simple multi-platform library like pthread than to use different things on
each compiler.


> Option 1)
>
> #if !defined(PODOFO_MULTI_THREAD)
>   static int  s_nRecursionDepth;  // PoDoFo is single threaded
> #elif defined(thread_local)
>   static int thread_local s_nRecursionDepth; // PoDoFo has threading
> support and using C++11 compiler
> #else
> #error C++11 thread_local is required for multi-thread
> #endif
>
> The thread_local macro is available in


Seems that macro is not available in C++ (it is present probably only when
compiling source as C).

I think this is a better option. I like this idea. As PODOFO_MULTI_THREAD
will enable C++11 then also other things like broken DCLP in
GlobalWinAnsiEncodingInstance can be easily fixed. It would simply use
correct atomic operations when this macro is enabled or does not use any
mutex here at all if this macro is not defined as in a single threaded
environment there is no need to use locking in
GlobalWinAnsiEncodingInstance (and in similar functions). Or it may just
use correct atomics when it is compiled with C++11 and be as it is now when
not so it would be fixed for C++11 but it will still be buggy without C++11
(it would be simpler to fix it just for C++11 as to use compiler specific
things like GCC "__atomic" builtins).
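
As a rough illustration of what "correct atomic operations" could mean here,
this is a sketch of double-checked locking with C++11 atomics. The type
`EncodingSketch` and the function `GlobalEncodingInstanceSketch` are
hypothetical stand-ins, not the PoDoFo code.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for an encoding object; not a PoDoFo type.
struct EncodingSketch
{
    int id;
};

// Double-checked locking using acquire/release ordering.
inline const EncodingSketch* GlobalEncodingInstanceSketch()
{
    static std::atomic<const EncodingSketch*> s_instance(nullptr);
    static std::mutex s_mutex;

    // Fast path: the acquire load pairs with the release store below, so a
    // non-null pointer implies the object it points to is fully constructed.
    const EncodingSketch* p = s_instance.load(std::memory_order_acquire);
    if (p == nullptr)
    {
        std::lock_guard<std::mutex> lock(s_mutex);
        // Second check under the mutex; the lock already orders this load.
        p = s_instance.load(std::memory_order_relaxed);
        if (p == nullptr)
        {
            p = new EncodingSketch{42}; // never freed in this sketch
            s_instance.store(p, std::memory_order_release);
        }
    }
    return p;
}
```

With C++11 available, a function-local static object would be a simpler
alternative, since its initialization is thread-safe by the language rules.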

> The 500 limit should be enough for a 32-bit release build with a 256 KB
> stack (it would use about 195 KB)


Not when the user of podofo already used some 64 KB before calling podofo.
To me it seems more reasonable to use a more conservative value which would
not consume more than some half (or tenth?) of the available stack in the
worst case.
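
The arithmetic behind this can be sketched as follows. It assumes roughly 400
bytes of stack per recursion level, which is what 195 KB for 500 levels works
out to; the function names are hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: deepest recursion that stays within 1/fraction of the stack.
constexpr std::size_t MaxDepthForStack(std::size_t stackBytes,
                                       std::size_t frameBytes,
                                       std::size_t fraction)
{
    return (stackBytes / fraction) / frameBytes;
}

// Hypothetical helper: does a given depth fit into the stack that is still free?
constexpr bool FitsInStack(std::size_t depth,
                           std::size_t frameBytes,
                           std::size_t availableBytes)
{
    return depth * frameBytes <= availableBytes;
}
```

With these numbers, 500 levels fit into an untouched 256 KB stack but not into
the 192 KB that remain after the caller has already used 64 KB.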

> Also worth discussing – should it be possible to disable the recursion
> guard completely with SetMaxRecursionDepth(0) ? This is a bad idea with
> untrusted input, but might make sense in some situations.


Not worth it. The user may just set it to 1 million or something.

> Only supporting compilers that still get security updates is a simple way
> to get rid of old compilers, and easy to justify.


I agree.

(Btw will podofo run in Turbo C++ 3.0?)

>Do we agree this would implement different functionality, with the
> potential of hard to debug sporadic effects depending on how the threads
> actually run?
>
> >
> Yes, although this already happens with PoDoFo – all the PoDoFo mutex
> methods are defined as no-ops unless  PODOFO_MULTI_THREAD is defined.
>
> I do not see how having or not having these mutexes in a single-threaded
> situation makes a functional difference.


Nor does a global vs. thread-local counter make a difference in a
single-threaded situation.


> What I meant was: Do we agree there is a fundamental difference between
> using thread local counters vs. global counters (atomic or behind mutexes),
> both in the PODOFO_MULTI_THREAD situation?


I suppose he meant that yes, in a multi-threaded situation a global counter
would mean different functionality. But if someone forgets to define
PODOFO_MULTI_THREAD (in a multi-threaded situation), he will also get some
hard-to-debug sporadic effects (I think it happened to me when I first
"played" with podofo).


>


Re: [Podofo-users] PoDoFo and recursive stack consumption CVEs

2021-11-20 Thread Michal Sudolsky
>
> Is option 3) worth investigating?


It seems like the best solution in current circumstances. Another solution
could be to eliminate all recursion but thread_local recursion counter is
simpler to do.


> #else
> // fallback to process global recursion count – overcounts depth if
> PDFs processed in parallel on multiple threads, same result as thread_local
> if process single threaded
> #define PODOFO_THREAD_LOCAL
> #endif


But this fallback would cause UB unless all access to t_nRecursionDepth is
atomic or guarded by mutex.


> In a release build on x86/x64 each recursive ReadArray call loop uses
> about 400 bytes of stack.


>- Windows IIS 32-bit worker processes – 256 KB max stack (stack
>overflows with 655 ‘[‘ characters)
>
>
So would the limit of 500 as default be enough? It should be far enough
from value which would cause stack overflow.


> I had a look at the patch in
> https://sourceforge.net/p/podofo/tickets/25/#51b9. That’s a simpler
> solution than the changes I proposed for PdfRecursionGuard.
>

That patch introduces UB (same reason as above).


Re: [Podofo-users] can't use exsited font to draw text

2021-10-11 Thread Michal Sudolsky
Hi,

On Mon, Oct 11, 2021 at 8:18 AM zyx  wrote:

> On Sat, 2021-10-09 at 18:25 +0800, 程丹皓 wrote:
> > as subType is Type0,I get the font from resource,and new a PdfFontCID
> > object. I tried code like below, it dosen't work, just show a blank
> > page
>
> Hi,
> I do not have an answer for your question, I never tried what you do
> here, but I do not understand why to create another CID font, when you
> have the font object already. Did you try to set the pCurFont to the
> painter instead?
>

Yes, that looks redundant and would probably work only for Type0 font (if
at all). pCurFont should work fine with the most recent podofo (maybe also
with 0.9.7) unless it is a subset font - then only characters used in the
original pdf would work.


> The second thing is that the embedded font can be a font subset, not
> containing all characters, thus when you draw with it the missing
> characters are not drawn at all (can be drawn as "missing character").
>
>
Seems fonts here are font subsets as they have prefixes like "YEYMZI+"
(F25).

> As you have the PDF structure browser, check what is generated. That
> may help you identify the problem.
> Bye,
> zyx
>
>


Re: [Podofo-users] Problem with œ ( oe) character

2021-10-09 Thread Michal Sudolsky
It works normally for me (tried with Helvetica and LiberationSans, default
winansi encoding). Maybe you did something wrong somewhere (you did not
provide any detail).

On Sat, Oct 9, 2021 at 5:29 PM Gilles Maire  wrote:

>
> I have a problem with one composed character in libpodofo: the
> French character œ from cœur (coeur). The æ
> character (ae) and all other characters work fine.
>
> With all the different font families, the œ character is replaced by ? in
> the PDF result.
>
> Any idea?
>
> Thanks and congratulations on your work.
>
> --
> Gilles Maire
>
> 06 07 99 06 55
>  gilles.mair...@sfr.fr
>  68 rue Leibniz Paris 75018 (France)
>  Porte de Saint-Ouen
>
>
>


Re: [Podofo-users] drawtext()function can not add CJK text to PDF

2021-08-01 Thread Michal Sudolsky
Hi,

There is no support for non-western writing systems. You would need to do
it yourself: https://sourceforge.net/p/podofo/mailman/message/36934272/

On Sun, Aug 1, 2021 at 1:53 PM ning  wrote:

> Dear PoDoFo team:
>
> I recently used podofo to add text to the PDF, but found that the correct
> Chinese cannot be added to the PDF under linux (ubuntu). The code I used is
> the following paragraph:
>
> pFont = pDocument->CreateFont( "Arial Unicode MS", new
> PdfIdentityEncoding( 0, 0xffff, true ) );
> printf("GOT: %s\n", pFont->GetFontMetrics()->GetFontname() );
> PdfString sJap(reinterpret_cast<const pdf_utf8*>("「Po\tDoFo」中文测试"));
> const long lUtf8BufferLen = 256;
> pdf_utf8 pUtf8Buffer[lUtf8BufferLen];
>
> PdfString::ConvertUTF16toUTF8( sJap.GetUnicode(),
> sJap.GetUnicodeLength(), pUtf8Buffer, lUtf8BufferLen  );
> printf("UNIC: %s\n", pUtf8Buffer );
>
> pFont->SetFontSize( 18.0 );
> pPainter->SetFont( pFont );
> pPainter->DrawText( 100.0, 100.0, sJap );
>
> In addition to this, I also used
>
> const PdfEncoding* pp =
> PdfEncodingFactory::GlobalIdentityEncodingInstance();
> PdfFont* pFont = document.CreateFont("SimHei", true, true, false,
> pp);//Helvetica
> painter.SetFont( pFont );
> double w = pPage->GetPageSize().GetWidth();
> double h = pPage->GetPageSize().GetHeight();
> PdfString str((pdf_utf8* )"测试podofo显示中文文本显示是否为乱码! \n \n");
> painter.DrawMultiLineText( 0, 0, w, h, str);
>
>
> After I use this code, I cannot find these words in the PDF, so I seek
> your help and hope to get a reply, thank you
>
> best wishes
> ning


Re: [Podofo-users] Some typo fixes

2021-07-20 Thread Michal Sudolsky
On Tue, Jul 20, 2021 at 10:37 AM zyx  wrote:

> On Tue, 2021-07-20 at 08:22 +0200, Christopher Creutzig wrote:
> > Changing the name of a private method is not an API change, is it?
>
> Hi,
> you are right, I also noticed it's a private method. To be honest, I do
> not know for sure. I chose a safe way.
>
> I can apply it, if you think it's okay.
>

I think it should be ok (I did not notice that it is private).


> Bye,
> zyx
>
>


Re: [Podofo-users] Some typo fixes

2021-07-19 Thread Michal Sudolsky
On Mon, Jul 19, 2021 at 6:53 PM Michal Sudolsky  wrote:

>
>
> On Mon, Jul 19, 2021 at 11:28 AM zyx  wrote:
>
>> Hi,
>>
>> On Sun, 2021-07-18 at 22:42 +0200, Michal Sudolsky wrote:
>> > Changing the name of some argument is not an API change. It has a
>> > similar effect as changing something in documentation.
>>
>> yeah, except there changed also a function name... ;)
>>
>
> I did not notice but I doubt anyone ever used that function. If
> afraid then there could be both GetIndeces and GetIndices:
>
>
GetIndeces as an alias marked as deprecated (PODOFO_DEPRECATED).
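
Roughly what I have in mind, as a sketch only: `XRefStreamSketch` is a
hypothetical class, the real method's signature may differ, and
`SKETCH_DEPRECATED` merely stands in for PODOFO_DEPRECATED (here it assumes a
C++14 compiler for the `[[deprecated]]` attribute).

```cpp
#include <utility>
#include <vector>

// Stand-in for the library's own deprecation macro (assumption: C++14 attribute).
#define SKETCH_DEPRECATED [[deprecated("use GetIndices() instead")]]

// Hypothetical class showing a corrected name plus a deprecated alias.
class XRefStreamSketch
{
public:
    explicit XRefStreamSketch(std::vector<long> indices)
        : m_indices(std::move(indices))
    {
    }

    // Correctly spelled accessor.
    const std::vector<long>& GetIndices() const { return m_indices; }

    // Old misspelled name kept as a forwarding alias so existing callers still compile.
    SKETCH_DEPRECATED const std::vector<long>& GetIndeces() const { return GetIndices(); }

private:
    std::vector<long> m_indices;
};
```

Callers of the old name would then get a compiler warning instead of a build
failure.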


>
>> Bye,
>> zyx
>>
>>


Re: [Podofo-users] Some typo fixes

2021-07-19 Thread Michal Sudolsky
On Mon, Jul 19, 2021 at 11:28 AM zyx  wrote:

> Hi,
>
> On Sun, 2021-07-18 at 22:42 +0200, Michal Sudolsky wrote:
> > Changing the name of some argument is not an API change. It has a
> > similar effect as changing something in documentation.
>
> yeah, except there changed also a function name... ;)
>

I did not notice but I doubt anyone ever used that function. If afraid then
there could be both GetIndeces and GetIndices:



> Bye,
> zyx
>
>


Re: [Podofo-users] String width is incorrect when using font loaded from pdf

2021-07-18 Thread Michal Sudolsky
On Sun, Jul 18, 2021 at 12:07 PM zyx  wrote:

> On Wed, 2018-12-05 at 20:34 +0100, Michal Sudolsky wrote:
> > Patch with corrected indentation attached.
>
> Hi,
> thanks for the patch. It has got lost in my mail queue (and nobody else
> picked it), I'm sorry. I committed it as r2035:
> https://sourceforge.net/p/podofo/code/2035
>
> Feel free to point me to other lost patches, in case you've any.
>

I think I added everything as tickets including this one.


> Thanks and bye,
> zyx
>
>


Re: [Podofo-users] Font size written differs from font size set.

2021-07-18 Thread Michal Sudolsky
On Sun, Jul 18, 2021 at 11:47 AM zyx  wrote:

> On Thu, 2021-05-27 at 18:07 +0200, Michal Sudolsky wrote:
> > Get/SetFontScale and Get/SetFontSize are inline functions so if
> > memory layout of class PdfFontMetrics is different in your exe than
> > in podofo dll then this and other bad things can happen.
>
> Hi,
> once upon a time, I suggested to make all inline functions non-inline,
> basically to remove all code execution out of the header files, which
> should help in such situations, but it had not been accepted.
>
> Should it be reconsidered, maybe?
>

This would not resolve anything; it would only mask the problem further. You
would also need to prohibit all user-side allocations (including on the stack),
because if the memory layout differs then the size of the class may differ
between the user side and the podofo side. Besides, inline functions are good:
they are faster.


> Bye,
> zyx
>
>


Re: [Podofo-users] Some typo fixes

2021-07-18 Thread Michal Sudolsky
On Sun, Jul 18, 2021 at 11:36 AM zyx  wrote:

> On Fri, 2021-05-21 at 11:16 +0200, Christopher Creutzig wrote:
> > To make searching the generated doc easier, I made a quick pass
> > fixing a few typos in the header file comments. I did not try to
> > unify between British and US spelling and I’m sure I did not find
> > everything.
>
> Hi,
> thanks for the patch. I committed it as r2034:
> http://sourceforge.net/p/podofo/code/2034
>
> I made small changed on the lines you changed, like:
>"an Unicode" ~> "a Unicode"
>"a xref" ~> "an XRef".
>
> I included all but the changes in PdfXRefStreamParserObject.cpp/.h,
> because you are changing API, which I hesitate to do (there had been a
> similar request in the past, I do not recall which API specifically,
> and it had not been done too).
>

Changing the name of some argument is not an API change. It has a similar
effect as changing something in documentation.


>
> In any case, you did a good job with this. Thanks for it.
> Bye,
> zyx
>
>


Re: [Podofo-users] Font size written differs from font size set.

2021-05-27 Thread Michal Sudolsky
Sorry, that was maybe not a well-informed tip. Here are the first few fonts
used on the first page:

/Ft7 18.000 Tf
100.000 Tz
/Ft9 12.000 Tf
100.000 Tz
/Ft11 12.000 Tf
100.000 Tz

And on second page:

/Ft7 12.000 Tf
18.000 Tz
/Ft9 12.000 Tf
12.000 Tz
/Ft11 12.000 Tf
12.000 Tz

Here the number before Tf is the font size and the number before Tz is the
horizontal scaling. It looks as if the podofo example, when it tries to set the
font size, is actually setting the horizontal scaling, as if SetFontScale were
called instead of SetFontSize (100 is the default horizontal scaling and 12 is
the default font size in podofo).

This could happen when the podofo library is compiled differently, with
different options (maybe even with a different C++ standard library), than the
podofo example. Get/SetFontScale and Get/SetFontSize are inline functions, so if
the memory layout of class PdfFontMetrics is different in your exe than in the
podofo dll, then this and other bad things can happen.

protected:
    std::string   m_sFilename;
    float         m_fFontSize;
    float         m_fFontScale;

For example, you can check the offsets of these members in the exe and in the
dll (or perhaps check the size of std::string) and figure out which of your
project settings makes them differ (if they do).
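
A minimal sketch of such a check, assuming a stand-in struct that only mirrors the three members quoted above (MetricsLayout, OffsetIn and PrintLayout are hypothetical names, not podofo code). Building the same snippet once with the settings of the exe and once with the settings of the dll, and comparing the printed numbers, should show whether the layouts agree:

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

// Hypothetical stand-in mirroring the three members quoted above.
// It is not the real PdfFontMetrics, which has more members and virtual functions.
struct MetricsLayout
{
    std::string m_sFilename;
    float       m_fFontSize;
    float       m_fFontScale;
};

// Byte offset of a member within an object, measured on a live instance.
template <typename T>
std::size_t OffsetIn(const MetricsLayout& obj, const T& member)
{
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(&member) - reinterpret_cast<const char*>(&obj));
}

// Print the numbers that have to match between the exe build and the dll build.
inline void PrintLayout()
{
    MetricsLayout m;
    const std::size_t sizeOff  = OffsetIn(m, m.m_fFontSize);
    const std::size_t scaleOff = OffsetIn(m, m.m_fFontScale);
    assert(scaleOff > sizeOff); // members are laid out in declaration order

    std::cout << "sizeof(std::string)    = " << sizeof(std::string) << '\n'
              << "offset of m_fFontSize  = " << sizeOff << '\n'
              << "offset of m_fFontScale = " << scaleOff << '\n';
}
```

If sizeof(std::string) differs between the two builds, every member after m_sFilename shifts, and an inline setter compiled into the exe writes to a different member than the dll reads.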


On Thu, May 27, 2021 at 4:00 PM Blayne Cullen  wrote:

> I’ve had a look at the file (attached) and it’s the same as the one that
> was generated with PoDoFo. I’ve also tried uncommenting and defining 
> PODOFO_HAVE_STRINGS_H,
> PODOFO_HAVE_ARPA_INET_H and PODOFO_HAVE_MEM_H and pointing them to their
> correct files, but no luck.
>
>
>
> *From:* Michal Sudolsky 
> *Sent:* 25 May 2021 18:15
> *To:* Blayne Cullen 
> *Cc:* podofo-us...@lists.sf.net
> *Subject:* Re: [Podofo-users] Font size written differs from font size
> set.
>
>
>
> Maybe something is wrong with your "podofo_config.h"?
>
>
>
> On Tue, May 25, 2021 at 3:55 PM Blayne Cullen 
> wrote:
>
> Hello PoDoFo Team,
>
>
>
> I have come across something interesting whilst attempting to use PoDoFo.
> When writing text to a pdf document using the tools/examples within the
> supplied podofo.sln work as they should. However, when writing text outside
> of the podofo.sln the text sizes appear wrong. I have even created an exact
> duplicate of the helloworld-base14 Project in an external solution but it
> still writes incorrect text sizes. Browsing the web I have come across
> others reporting similar problems but have found no solution, only
> temperamental work-arounds such as tweaking the Transformation Matrix until
> the correct results are achieved. Attached is a Pdf document that contains
> two outputs of the “helloworld-base14” project; the first created from
> within the supplied podofo.sln and the second in a duplicate external
> solution. I have debugged into it a bit and saw that the first error it
> makes is in PdfPainter::DrawText(Ln 810) where
> m_pFont->GetIdentifier().GetName() is being used, it picks up a different
> name in the external solution to the internal solution. The errors continue
> down the line within this function as well.
>
>
>
> In addition to the Transformation Matrix work-around, it was also
> mentioned that the generated PoDoFo library was the cause of the problem.
> I’ve rebuilt the library many times using different building tools such as
> NMake and Visual Studio, using different versions of PoDoFo source files
> (0.9.7 and R2033) and with different versions of PoDoFo’s required
> libraries: FreeType, ZLib and Jpeg; all producing the same issue.
>
>
>
> My current/target environment are as follows:
>
>- Visual Studio 2017
>- Freetype 2.10.4, Zlib 1.2.3 (preferably using 1.2.11), Jpeg-9d
>- Windows 10 (19041.985) 64bit
>- Target PoDoFo libraries: 32bit and 64bit using a shared library (dll)
>
>
>
> Some other things I have noticed (which may all lead to the same core
> issue) is that other functions that are used in other examples such as
> defining the document’s information (e.g.
> document.GetInfo()->SetCreator(…)) is throwing a NULLPTR error. Again this
> all works within the PoDoFo solution but not in any external solutions.
>
>
>
> I hope I have explained the situation well, but if you require any further
> details, please don’t hesitate to reply.
>
>
>
> Thanks,
>
> Blayne
>


Re: [Podofo-users] Font size written differs from font size set.

2021-05-25 Thread Michal Sudolsky
Maybe something is wrong with your "podofo_config.h"?

On Tue, May 25, 2021 at 3:55 PM Blayne Cullen  wrote:

> Hello PoDoFo Team,
>
>
>
> I have come across something interesting whilst attempting to use PoDoFo.
> When writing text to a pdf document using the tools/examples within the
> supplied podofo.sln work as they should. However, when writing text outside
> of the podofo.sln the text sizes appear wrong. I have even created an exact
> duplicate of the helloworld-base14 Project in an external solution but it
> still writes incorrect text sizes. Browsing the web I have come across
> others reporting similar problems but have found no solution, only
> temperamental work-arounds such as tweaking the Transformation Matrix until
> the correct results are achieved. Attached is a Pdf document that contains
> two outputs of the “helloworld-base14” project; the first created from
> within the supplied podofo.sln and the second in a duplicate external
> solution. I have debugged into it a bit and saw that the first error it
> makes is in PdfPainter::DrawText(Ln 810) where
> m_pFont->GetIdentifier().GetName() is being used, it picks up a different
> name in the external solution to the internal solution. The errors continue
> down the line within this function as well.
>
>
>
> In addition to the Transformation Matrix work-around, it was also
> mentioned that the generated PoDoFo library was the cause of the problem.
> I’ve rebuilt the library many times using different building tools such as
> NMake and Visual Studio, using different versions of PoDoFo source files
> (0.9.7 and R2033) and with different versions of PoDoFo’s required
> libraries: FreeType, ZLib and Jpeg; all producing the same issue.
>
>
>
> My current/target environment are as follows:
>
>- Visual Studio 2017
>- Freetype 2.10.4, Zlib 1.2.3 (preferably using 1.2.11), Jpeg-9d
>- Windows 10 (19041.985) 64bit
>- Target PoDoFo libraries: 32bit and 64bit using a shared library (dll)
>
>
>
> Some other things I have noticed (which may all lead to the same core
> issue) is that other functions that are used in other examples such as
> defining the document’s information (e.g.
> document.GetInfo()->SetCreator(…)) is throwing a NULLPTR error. Again this
> all works within the PoDoFo solution but not in any external solutions.
>
>
>
> I hope I have explained the situation well, but if you require any further
> details, please don’t hesitate to reply.
>
>
>
> Thanks,
>
> Blayne


Re: [Podofo-users] PdfFontMetricsFreetype::CharWidth

2021-05-11 Thread Michal Sudolsky
On Tue, May 11, 2021 at 6:12 PM Christophe  wrote:

> Michal,
>
> Yes, I read an existing PDF file and no, I don't want to draw. I just need
> to know the exact position of each character on the page. In a general
> manner, when you parse a PDF document and need such information, you will
> assign the font char space to the current font in order to get the right
> computation.
>
>
Then you can either reverse the computation used by podofo, or somehow get
dWidth as in CharWidth and do your own computation. Also, strings in Tj and
other operators are not Unicode or ASCII strings. They are actually strings
composed of glyph ids, so the best option may be to use GetGlyphWidth and apply
Tc and the other parameters to it. That avoids the round trip (and possible
losses) of converting the string to Unicode and then back for other font
metrics types.
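
A minimal sketch of that second approach, following the horizontal displacement formula of the PDF specification, tx = (w0 * Tfs + Tc + Tw) * Th, with Th = Tz / 100. TextState and GlyphAdvance are hypothetical names, not podofo API; the glyph width is assumed to be in 1/1000 glyph units (true for all but Type 3 fonts), and TJ adjustments are ignored:

```cpp
#include <cassert>
#include <cmath>

// Text state parameters as read from the content stream (hypothetical struct).
struct TextState
{
    double fontSize;   // Tfs, the operand of Tf
    double charSpace;  // Tc, in unscaled text space units
    double wordSpace;  // Tw, applies only to the single-byte code 32
    double horizScale; // Tz, in percent (100 = normal)
};

// Horizontal advance of one glyph in text space.
inline double GlyphAdvance(double glyphWidth1000, bool isSpace, const TextState& ts)
{
    assert(ts.horizScale > 0.0); // a zero Tz would collapse every advance
    const double w0 = glyphWidth1000 / 1000.0;
    const double tw = isSpace ? ts.wordSpace : 0.0;
    return (w0 * ts.fontSize + ts.charSpace + tw) * (ts.horizScale / 100.0);
}

// Usage: a 500-unit glyph at 12 pt with Tc = 0.5 advances by 6.5 units.
inline void DemoGlyphAdvance()
{
    const TextState ts = {12.0, 0.5, 0.0, 100.0};
    assert(std::fabs(GlyphAdvance(500.0, false, ts) - 6.5) < 1e-9);
}
```

Summing these advances along a Tj string, and transforming by the text and current transformation matrices, gives the position of each character on the page.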

> Christophe
>
>
> Le mar. 11 mai 2021 à 18:03, Michal Sudolsky  a
> écrit :
>
>>
>>
>> On Tue, May 11, 2021 at 5:49 PM Christophe  wrote:
>>
>>> Michal,
>>>
>>> It means we can't directly call function *SetFontCharSpace* with Tc ?
>>> We have to make a computation based on font size in our applicative code.
>>> In this case, could you please elaborate and provide us the right
>>> transformation from Tc to FontCharSpace ?
>>>
>>>
>> This is how it was done seems. Why would you do that? You are parsing
>> some existing content stream and then drawing it again?
>>
>>
>>> Thank you
>>>
>>> Christophe
>>>
>>>
>>> Le mar. 11 mai 2021 à 16:23, Michal Sudolsky  a
>>> écrit :
>>>
>>>> Hi,
>>>>
>>>> On Tue, May 11, 2021 at 9:15 AM Christophe  wrote:
>>>>
>>>>> Hello Michal,
>>>>>
>>>>> My remark was relative to class  PdfFontMetricsFreetype, not
>>>>> PdfFontMetricsObject.
>>>>>
>>>>>
>>>> Yes, I meant that implementation in PdfFontMetricsFreetype (and also
>>>> PdfFontMetricsBase14) is correct but the problem is
>>>> with PdfFontMetricsObject.
>>>>
>>>>> But, in your patch, you also divide font char space by 100, twice:
>>>>>
>>>>> return (dWidth * m_matrix.front().GetReal() + this->GetFontCharSpace()
>>>>> / 100.0) * this->GetFontSize() * this->GetFontScale() / 100.0;
>>>>>
>>>>>
>>>> This is same as what is used in PdfFontMetricsFreetype
>>>> and PdfFontMetricsBase14 (I just simplified the formula):
>>>>
>>>> return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
>>>>     static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
>>>>
>>>>
>>>>> Are you sure of that ?
>>>>>
>>>>> In PDF format, the font char space (Tc) is given in PDF units, not in
>>>>> percent. Does it mean, we have to transform this Tc value into percent
>>>>> before affecting the m_fFontCharSpace variable ? If yes, how shall we
>>>>> do that ?
>>>>>
>>>>>
>>>> Both font scale and font char space are in percent in podofo. Font char
>>>> space is not directly passed as Tc. Font scale is in percent also in pdf.
>>>>
>>>> m_oss << m_pFont->GetFontCharSpace() * m_pFont->GetFontSize() /
>>>> 100.0 << " Tc" << std::endl;
>>>>
>>>>
>>>>> Thank you for your help
>>>>>
>>>>> Christophe
>>>>>
>>>>>
>>>>> Le lun. 10 mai 2021 à 20:54, Michal Sudolsky  a
>>>>> écrit :
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, May 10, 2021 at 12:23 PM Christophe  wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I have a doubt relative to this method definition:
>>>>>>>
>>>>>>> double PdfFontMetricsFreetype::CharWidth( unsigned char c ) const
>>>>>>> {
>>>>>>>     double dWidth = m_vecWidth[static_cast<unsigned int>(c)];
>>>>>>>
>>>>>>>     return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
>>>>>>>         static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
>>>>>>> }
>>>>>>>
>>>>>>> The Char Space is divided by 100.0 and I think it is an error as it
>>>>>>> is not the case for PdfFontMetricsObject::CharWidth, nor for
>>>>>>> PdfFontMetricsObject::CharWidth.
>>>>>>>
>>>>>>>
>>>>>> It is intended to be divided by 100. See comment above that function:
>>>>>>
>>>>>> /** Set the character spacing of this metrics object
>>>>>>  *  \param fCharSpace character spacing in percent
>>>>>>  */
>>>>>> inline void SetFontCharSpace( float fCharSpace );
>>>>>>
>>>>>>
>>>>>>> What do you think ?
>>>>>>>
>>>>>>> Thank you
>>>>>>>
>>>>>>> Christophe


Re: [Podofo-users] PdfFontMetricsFreetype::CharWidth

2021-05-11 Thread Michal Sudolsky
On Tue, May 11, 2021 at 5:49 PM Christophe  wrote:

> Michal,
>
> It means we can't directly call function *SetFontCharSpace* with Tc ? We
> have to make a computation based on font size in our applicative code. In
> this case, could you please elaborate and provide us the right
> transformation from Tc to FontCharSpace ?
>
>
This is how it was done, it seems. Why would you do that? Are you parsing some
existing content stream and then drawing it again?


> Thank you
>
> Christophe
>
>
> Le mar. 11 mai 2021 à 16:23, Michal Sudolsky  a
> écrit :
>
>> Hi,
>>
>> On Tue, May 11, 2021 at 9:15 AM Christophe  wrote:
>>
>>> Hello Michal,
>>>
>>> My remark was relative to class  PdfFontMetricsFreetype, not
>>> PdfFontMetricsObject.
>>>
>>>
>> Yes, I meant that implementation in PdfFontMetricsFreetype (and also
>> PdfFontMetricsBase14) is correct but the problem is
>> with PdfFontMetricsObject.
>>
>>> But, in your patch, you also divide font char space by 100, twice:
>>>
>>> return (dWidth * m_matrix.front().GetReal() + this->GetFontCharSpace() /
>>> 100.0) * this->GetFontSize() * this->GetFontScale() / 100.0;
>>>
>>>
>> This is same as what is used in PdfFontMetricsFreetype
>> and PdfFontMetricsBase14 (I just simplified the formula):
>>
>> return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
>>     static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
>>
>>
>>> Are you sure of that ?
>>>
>>> In PDF format, the font char space (Tc) is given in PDF units, not in
>>> percent. Does it mean, we have to transform this Tc value into percent
>>> before affecting the m_fFontCharSpace variable ? If yes, how shall we
>>> do that ?
>>>
>>>
>> Both font scale and font char space are in percent in podofo. Font char
>> space is not directly passed as Tc. Font scale is in percent also in pdf.
>>
>> m_oss << m_pFont->GetFontCharSpace() * m_pFont->GetFontSize() / 100.0
>> << " Tc" << std::endl;
>>
>>
>>> Thank you for your help
>>>
>>> Christophe
>>>
>>>
>>> Le lun. 10 mai 2021 à 20:54, Michal Sudolsky  a
>>> écrit :
>>>
>>>>
>>>>
>>>> On Mon, May 10, 2021 at 12:23 PM Christophe  wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I have a doubt relative to this method definition:
>>>>>
>>>>> double PdfFontMetricsFreetype::CharWidth( unsigned char c ) const
>>>>> {
>>>>>     double dWidth = m_vecWidth[static_cast<unsigned int>(c)];
>>>>>
>>>>>     return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
>>>>>         static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
>>>>> }
>>>>>
>>>>> The Char Space is divided by 100.0 and I think it is an error as it is
>>>>> not the case for PdfFontMetricsObject::CharWidth, nor for
>>>>> PdfFontMetricsObject::CharWidth.
>>>>>
>>>>>
>>>> It is intended to be divided by 100. See comment above that function:
>>>>
>>>> /** Set the character spacing of this metrics object
>>>>  *  \param fCharSpace character spacing in percent
>>>>  */
>>>> inline void SetFontCharSpace( float fCharSpace );
>>>>
>>>>
>>>>> What do you think ?
>>>>>
>>>>> Thank you
>>>>>
>>>>> Christophe


Re: [Podofo-users] PdfFontMetricsFreetype::CharWidth

2021-05-11 Thread Michal Sudolsky
Font char space as defined in podofo is scaled by font size:

m_oss << m_pFont->GetFontCharSpace() * m_pFont->GetFontSize() / 100.0
<< " Tc" << std::endl;

Do you have an example where the current implementation in
PdfFontMetricsFreetype or the implementation from my patch fails?
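
The scaling in the line quoted above can be sketched as a pair of conversions. The function names are hypothetical and not part of podofo; the first mirrors the expression written before " Tc", the second is simply its inverse, for the case where a Tc read from an existing content stream should be passed to SetFontCharSpace:

```cpp
#include <cassert>
#include <cmath>

// Tc as written to the content stream: char space (percent) scaled by font size.
inline double CharSpacePercentToTc(double charSpacePercent, double fontSize)
{
    return charSpacePercent * fontSize / 100.0;
}

// Inverse: percent value to hand to SetFontCharSpace for a given Tc.
inline double TcToCharSpacePercent(double tc, double fontSize)
{
    assert(fontSize != 0.0); // the conversion is undefined for a zero font size
    return tc * 100.0 / fontSize;
}

// Usage: 10 percent at 12 pt corresponds to Tc = 1.2, and back again.
inline void DemoCharSpaceRoundTrip()
{
    const double tc = CharSpacePercentToTc(10.0, 12.0);
    assert(std::fabs(tc - 1.2) < 1e-12);
    assert(std::fabs(TcToCharSpacePercent(tc, 12.0) - 10.0) < 1e-9);
}
```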

On Tue, May 11, 2021 at 10:37 AM Christopher Creutzig <
ccreu...@mathworks.com> wrote:

> What is more: Tc is in unscaled text units, only subject to scaling by Th,
> not Tfs. I think this needs to be:
>
>
>
> return (dWidth * m_matrix.front().GetReal() * this->GetFontSize() +
> this->GetFontCharSpace())
>
>  * (this->GetFontScale() / 100.0);
>
>
>
>
>
> Cheers,
>
> Christopher
>
>
>
> The MathWorks GmbH | Friedlandstr.18 | 52064 Aachen | District Court
> Aachen | HRB 8082 | Managing Directors: Bertrand Dissler, Steven D. Barbo,
> Jeanne O’Keefe
>
>
>
>
>
>
>
> *From:* Christophe 
> *Sent:* Tuesday, May 11, 2021 9:15
> *To:* Michal Sudolsky 
> *Cc:* podofo-us...@lists.sf.net
> *Subject:* Re: [Podofo-users] PdfFontMetricsFreetype::CharWidth
>
>
>
> Hello Michal,
>
>
>
> My remark was relative to class  PdfFontMetricsFreetype, not
> PdfFontMetricsObject.
>
>
>
> But, in your patch, you also divide font char space by 100, twice:
>
>
>
> return (dWidth * m_matrix.front().GetReal() + this->GetFontCharSpace() /
> 100.0) * this->GetFontSize() * this->GetFontScale() / 100.0;
>
>
>
> Are you sure of that ?
>
>
>
> In PDF format, the font char space (Tc) is given in PDF units, not in
> percent. Does it mean, we have to transform this Tc value into percent
> before affecting the m_fFontCharSpace variable ? If yes, how shall we do
> that ?
>
>
>
> Thank you for your help
>
>
>
> Christophe
>
>
>
>
>
> Le lun. 10 mai 2021 à 20:54, Michal Sudolsky  a
> écrit :
>
>
>
>
>
> On Mon, May 10, 2021 at 12:23 PM Christophe  wrote:
>
> Hello all,
>
>
>
> I have a doubt relative to this method definition:
>
>
>
> double PdfFontMetricsFreetype::CharWidth( unsigned char c ) const
> {
>     double dWidth = m_vecWidth[static_cast<unsigned int>(c)];
>
>     return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
>         static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
> }
>
>
>
> The Char Space is divided by 100.0 and I think it is an error as it is not
> the case for PdfFontMetricsObject::CharWidth, nor for
> PdfFontMetricsObject::CharWidth.
>
>
>
>
>
> It is intended to be divided by 100. See comment above that function:
>
> /** Set the character spacing of this metrics object
>  *  \param fCharSpace character spacing in percent
>  */
> inline void SetFontCharSpace( float fCharSpace );
>
>
>
> What do you think ?
>
>
>
> Thank you
>
>
> Christophe
>


Re: [Podofo-users] PdfFontMetricsFreetype::CharWidth

2021-05-11 Thread Michal Sudolsky
Hi,

On Tue, May 11, 2021 at 9:15 AM Christophe  wrote:

> Hello Michal,
>
> My remark was relative to class  PdfFontMetricsFreetype, not
> PdfFontMetricsObject.
>
>
Yes, I meant that the implementation in PdfFontMetricsFreetype (and also
PdfFontMetricsBase14) is correct, but the problem is
with PdfFontMetricsObject.

> But, in your patch, you also divide font char space by 100, twice:
>
> return (dWidth * m_matrix.front().GetReal() + this->GetFontCharSpace() /
> 100.0) * this->GetFontSize() * this->GetFontScale() / 100.0;
>
>
This is the same as what is used in PdfFontMetricsFreetype
and PdfFontMetricsBase14 (I just simplified the formula):

return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
    static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
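
A quick numeric check of that equivalence, as a sketch: the two free functions below are hypothetical rewrites of the two return expressions with plain doubles, and the first font matrix entry is assumed to be 0.001, which is the usual value but an assumption here.

```cpp
#include <cassert>
#include <cmath>

// Expression used in PdfFontMetricsFreetype / PdfFontMetricsBase14.
inline double WidthOriginal(double dWidth, double size, double scale, double charSpace)
{
    return dWidth * (size * scale / 100.0) / 1000.0
         + size * scale / 100.0 * charSpace / 100.0;
}

// Simplified expression from the patch; matrixFront replaces the fixed 1/1000.
inline double WidthSimplified(double dWidth, double matrixFront,
                              double size, double scale, double charSpace)
{
    return (dWidth * matrixFront + charSpace / 100.0) * size * scale / 100.0;
}

// Usage: with matrixFront = 0.001 both forms give the same width.
inline void DemoFormulasAgree()
{
    const double a = WidthOriginal(500.0, 12.0, 100.0, 10.0);
    const double b = WidthSimplified(500.0, 0.001, 12.0, 100.0, 10.0);
    assert(std::fabs(a - b) < 1e-9);
}
```

The char space is divided by 100 only once in each form; the second division by 100 belongs to the font scale.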


> Are you sure of that ?
>
> In PDF format, the font char space (Tc) is given in PDF units, not in
> percent. Does it mean, we have to transform this Tc value into percent
> before affecting the m_fFontCharSpace variable ? If yes, how shall we do
> that ?
>
>
Both font scale and font char space are in percent in podofo. Font char
space is not passed directly as Tc; it is scaled by the font size first. Font
scale is in percent in PDF as well.

m_oss << m_pFont->GetFontCharSpace() * m_pFont->GetFontSize() / 100.0
<< " Tc" << std::endl;


> Thank you for your help
>
> Christophe
>
>
> Le lun. 10 mai 2021 à 20:54, Michal Sudolsky  a
> écrit :
>
>>
>>
>> On Mon, May 10, 2021 at 12:23 PM Christophe  wrote:
>>
>>> Hello all,
>>>
>>> I have a doubt relative to this method definition:
>>>
>>> double PdfFontMetricsFreetype::CharWidth( unsigned char c ) const
>>> {
>>>     double dWidth = m_vecWidth[static_cast<unsigned int>(c)];
>>>
>>>     return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
>>>         static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
>>> }
>>>
>>> The Char Space is divided by 100.0 and I think it is an error as it is
>>> not the case for PdfFontMetricsObject::CharWidth, nor for
>>> PdfFontMetricsObject::CharWidth.
>>>
>>>
>> It is intended to be divided by 100. See comment above that function:
>>
>> /** Set the character spacing of this metrics object
>>  *  \param fCharSpace character spacing in percent
>>  */
>> inline void SetFontCharSpace( float fCharSpace );
>>
>>
>>> What do you think ?
>>>
>>> Thank you
>>>
>>> Christophe


Re: [Podofo-users] PdfFontMetricsFreetype::CharWidth

2021-05-10 Thread Michal Sudolsky
On Mon, May 10, 2021 at 12:23 PM Christophe  wrote:

> Hello all,
>
> I have a doubt relative to this method definition:
>
> double PdfFontMetricsFreetype::CharWidth( unsigned char c ) const
> {
>     double dWidth = m_vecWidth[static_cast<unsigned int>(c)];
>
>     return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
>         static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
> }
>
> The Char Space is divided by 100.0 and I think it is an error as it is not
> the case for PdfFontMetricsObject::CharWidth, nor for
> PdfFontMetricsObject::CharWidth.
>
>
It is intended to be divided by 100. See comment above that function:

/** Set the character spacing of this metrics object
 *  \param fCharSpace character spacing in percent
 */
inline void SetFontCharSpace( float fCharSpace );


> What do you think ?
>
> Thank you
>
> Christophe


Re: [Podofo-users] PdfFontMetricsFreetype::CharWidth

2021-05-10 Thread Michal Sudolsky
Hi,

There are more problems with this function. I already sent a patch for
some: https://sourceforge.net/p/podofo/mailman/message/36486334/
Among other things it delegates CharWidth to UnicodeCharWidth. I attached a
follow up patch which fixes the problem with division by 100
in UnicodeCharWidth among other things. I am not sure whether I already
posted it.


On Mon, May 10, 2021 at 12:23 PM Christophe  wrote:

> Hello all,
>
> I have a doubt relative to this method definition:
>
> double PdfFontMetricsFreetype::CharWidth( unsigned char c ) const
> {
>     double dWidth = m_vecWidth[static_cast<unsigned int>(c)];
>
>     return dWidth * static_cast<double>(this->GetFontSize() * this->GetFontScale() / 100.0) / 1000.0 +
>         static_cast<double>( this->GetFontSize() * this->GetFontScale() / 100.0 * this->GetFontCharSpace() / 100.0);
> }
>
> The Char Space is divided by 100.0 and I think it is an error as it is not
> the case for PdfFontMetricsObject::CharWidth, nor for
> PdfFontMetricsObject::CharWidth.
>
> What do you think ?
>
> Thank you
>
> Christophe
>


implement_glyph_width_for_object_metrics.patch
Description: Binary data


Re: [Podofo-users] DCLP in PoDoFo

2021-05-07 Thread Michal Sudolsky
On Wed, May 5, 2021 at 9:34 AM Christopher Creutzig 
wrote:

> Hi Michal,
>
>
>
> Unfortunately, without memory barriers, DCLP as found in these places is
> not correct on x64 either, and I would not want to rely on
> std::lock::lock() containing fences.
>

I would rather say that it is not correct in C++.


> As I said, the worst thing likely to happen in practice is that two
> encoding objects are allocated, as writing a pointer to aligned memory is
> going to be an atomic operation in virtually all cases.
>

No, this cannot happen, because the allocation is protected by the mutex
(unless PODOFO_MULTI_THREAD is not defined). In practice nothing bad will
happen on x64 nowadays. But on ARM a thread may access the object's properties
before the object is fully initialised: it could see the object's address
before it sees the writes another thread made into that object, because the
synchronizes-with relationship is missing. In other words, its view of the
order of memory operations can be different (unless the first check is at
least a "consume" operation).

I would prefer lazy initialisation. There are encodings which are not used
by everyone. And maybe it would be better to also fix the pre-C++11 version
by removing the first check (or perhaps use option 2 on pre-C++11?). The
current code is problematic for older C++ versions as well on processors like
ARM.
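
For a C++11 build, the missing relationship can be supplied with an atomic pointer while keeping the initialisation lazy. A minimal sketch, assuming C++11 is available; Encoding and GlobalEncodingInstance are hypothetical stand-ins, not the podofo classes, and acquire is used for the first check as the conservative choice:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for an encoding class such as PdfDocEncoding.
struct Encoding
{
    int tableSize = 256;
};

inline const Encoding* GlobalEncodingInstance()
{
    static std::atomic<const Encoding*> s_pEncoding{nullptr};
    static std::mutex s_mutex;

    // First check: this acquire load pairs with the release store below, so a
    // thread that sees the pointer also sees the fully constructed object.
    const Encoding* p = s_pEncoding.load(std::memory_order_acquire);
    if (!p)
    {
        std::lock_guard<std::mutex> lock(s_mutex);

        // Second check: the mutex already orders this load, relaxed is enough.
        p = s_pEncoding.load(std::memory_order_relaxed);
        if (!p)
        {
            p = new Encoding(); // kept for the process lifetime in this sketch
            s_pEncoding.store(p, std::memory_order_release);
        }
    }
    assert(p != nullptr);
    return p;
}
```

Every caller gets the same instance, and only the very first calls ever take the mutex. A function-local static object achieves the same with less code in C++11, since its initialisation is guaranteed to be thread-safe.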


>
> Personally, I would either use the safe form where available or just
> initialize all the encoding classes statically:
>
>
>
> Option 1:
>
>
>
> const PdfEncoding* PdfEncodingFactory::GlobalPdfDocEncodingInstance()
> {
> #if __cplusplus >= 201103L
>
>     static PdfDocEncoding docEncoding;
>     return &docEncoding;
>
> #else // pre-C++11
>
>     if(!s_pDocEncoding) // First check
>     {
>         Util::PdfMutexWrapper wrapper( PdfEncodingFactory::s_mutex );
>
>         if(!s_pDocEncoding) // Double check
>             s_pDocEncoding = new PdfDocEncoding();
>     }
>
>     return s_pDocEncoding;
>
> #endif
> }
>
>
>
>
> Option 2:
>
>
>
> class PODOFO_API PdfEncodingFactory {
>     …
> private:
>     static const PdfDocEncoding docEncoding;
>     …
> };
>
> const PdfEncoding* PdfEncodingFactory::GlobalPdfDocEncodingInstance()
> {
>     return &docEncoding;
> }
>
>
>

>
> Cheers,
>
> Christopher
>
>
>
> *From:* Michal Sudolsky 
> *Sent:* Tuesday, May 4, 2021 17:08
> *To:* Christopher Creutzig 
> *Cc:* podofo-users@lists.sourceforge.net
> *Subject:* Re: [Podofo-users] DCLP in PoDoFo
>
>
>
> Hi,
>
>
>
> See: https://sourceforge.net/p/podofo/mailman/message/35915862/
>
>
>
> On Mon, May 3, 2021 at 2:05 PM Christopher Creutzig <
> ccreu...@mathworks.com> wrote:
>
> Hi list,
>
>
>
> PdfEncodingFactory.cpp uses a broken form of double-checked locking
> <https://preshing.com/20130930/double-checked-locking-is-fixed-in-cpp11/>
> to initialize its encoding instances. Not a big deal, as these objects
> don’t do much; technically, that is a data race and can lead to undefined
> behaviour, but realistically, I would be surprised to see anything worse
> than a small memory leak, if even that.
>
>
>
> It is undefined behaviour in C++. Actually it should cause problems only
> on weakly ordered processors like ARM. So you should never see it on x64.
>
>
>
>
>
> Does PoDoFo require C++11 or newer (where there are simple fixes
> available)? Will it ever?
>
>
>
> If podofo cannot depend on C++11 then I would suggest to use
> "single"-checked locking (just remove the first check):
>
>
>
> //if(!s_pWinAnsiEncoding) // First check
> //{
> Util::PdfMutexWrapper wrapper( PdfEncodingFactory::s_mutex );
>
> if(!s_pWinAnsiEncoding) // Double check
> s_pWinAnsiEncoding = new PdfWinAnsiEncoding();
> //}
>
> return s_pWinAnsiEncoding;
>
>
>
> Another solution would be to use some additional library that provides
> atomic primitives for older C++ but I do not think that it would be worth
> it in this case.
>
>
>
>
>
>
>
>
>
> Cheers,
>
> Christopher
>
>
>
> The MathWorks GmbH | Friedlandstr.18 | 52064 Aachen | District Court
> Aachen | HRB 8082 | Managing Directors: Bertrand Dissler, Steven D. Barbo,
> Jeanne O’Keefe
>
>
>


Re: [Podofo-users] DCLP in PoDoFo

2021-05-04 Thread Michal Sudolsky
Hi,

See: https://sourceforge.net/p/podofo/mailman/message/35915862/

On Mon, May 3, 2021 at 2:05 PM Christopher Creutzig 
wrote:

> Hi list,
>
>
>
> PdfEncodingFactory.cpp uses a broken form of double-checked locking
> 
> to initialize its encoding instances. Not a big deal, as these objects
> don’t do much; technically, that is a data race and can lead to undefined
> behaviour, but realistically, I would be surprised to see anything worse
> than a small memory leak, if even that.
>

It is undefined behaviour in C++. In practice it should cause problems mainly
on weakly ordered processors like ARM, so you are unlikely to see it on x64.


>
> Does PoDoFo require C++11 or newer (where there are simple fixes
> available)? Will it ever?
>
>
If podofo cannot depend on C++11 then I would suggest using
"single"-checked locking (just remove the first check):

//if(!s_pWinAnsiEncoding) // First check
//{
    Util::PdfMutexWrapper wrapper( PdfEncodingFactory::s_mutex );

    if(!s_pWinAnsiEncoding) // Second check, now the only one
        s_pWinAnsiEncoding = new PdfWinAnsiEncoding();
//}

return s_pWinAnsiEncoding;

Another solution would be to use some additional library that provides
atomic primitives for older C++ but I do not think that it would be worth
it in this case.
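
For comparison, if C++11 could be required, one of the "simple fixes" mentioned in the question is a function-local static, which the language initializes exactly once even under concurrent calls. The following is only a sketch under that assumption: `DemoEncoding` and `GlobalDemoEncodingInstance` are made-up stand-ins, not PoDoFo classes or functions.

```cpp
#include <atomic>

// Hypothetical stand-in for an encoding class; it only counts constructions.
struct DemoEncoding {
    static std::atomic<int> s_constructed;
    DemoEncoding() { ++s_constructed; }
};
std::atomic<int> DemoEncoding::s_constructed{0};

// Since C++11 a function-local static is initialized exactly once, even if
// several threads call the function at the same time, so no explicit mutex
// and no hand-written double check are needed.
const DemoEncoding* GlobalDemoEncodingInstance()
{
    static const DemoEncoding instance;
    return &instance;
}
```

Every call returns the same pointer and the constructor runs once; the object lives until program exit instead of being deleted explicitly.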



>
>
>
> Cheers,
>
> Christopher
>
>
>
> The MathWorks GmbH | Friedlandstr.18 | 52064 Aachen | District Court
> Aachen | HRB 8082 | Managing Directors: Bertrand Dissler, Steven D. Barbo,
> Jeanne O’Keefe
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Question about Stamp annotation and user rotation

2021-04-08 Thread Michal Sudolsky
Hi,

I would try some flags from enum EPdfAnnotationFlags
like ePdfAnnotationFlags_NoRotate, ePdfAnnotationFlags_ReadOnly or
ePdfAnnotationFlags_Locked.
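
Annotation flags are single bits of one integer, so several of them can be combined with a bitwise OR. A small self-contained sketch, assuming the bit positions from the annotation flags table of the pdf reference; the `kDemo...` names are local to this example and are not the PoDoFo enum. Whether a viewer then stops offering the rotation handle is up to the viewer.

```cpp
#include <cstdint>

// Bit values as listed in the pdf reference (bit position 1 is the lowest bit).
// These names are made up for the sketch.
enum DemoAnnotationFlags : std::uint32_t {
    kDemoFlagPrint    = 1u << 2, // bit position 3
    kDemoFlagNoRotate = 1u << 4, // bit position 5
    kDemoFlagReadOnly = 1u << 6, // bit position 7
    kDemoFlagLocked   = 1u << 7  // bit position 8
};

// Flags one might try for a stamp that should print but not be rotated or edited.
std::uint32_t StampFlags()
{
    return kDemoFlagPrint | kDemoFlagNoRotate | kDemoFlagLocked;
}

// Returns true if the given flag bit is set in the combined value.
bool HasFlag(std::uint32_t flags, DemoAnnotationFlags flag)
{
    return (flags & flag) != 0;
}
```

With PoDoFo the combined value would presumably be passed to the same SetFlags call that the quoted code already uses with ePdfAnnotationFlags_Print.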

On Thu, Apr 8, 2021 at 10:02 AM Grégory WENTZEL  wrote:

> Hello everybody, I have a dumb question about stamp-type annotations.
>
> I wrote a program to add a stamp on each page of a PDF and the user
> can move it anywhere on the page.
> The problem is that the user can rotate the stamp on the page, and it then
> turns into a draft icon stamp.
> Is there a way to prevent the user from rotating the annotation?
>
> First an extract of the code I use to create the annotation in the PDF
> document :
>
> PdfStreamedDocument document(PathFile.c_str());
> PdfFont* pFontNormal = document.CreateFont("Arial", false, false);
> PdfPage* pPage = document.CreatePage(PdfRect(0.0, 0.0, 595.0, 842.0));
> PdfPainter painter;
> PdfRect rect(0.0, 0.0, 40.0, 40.0);
> PdfXObject xObj(rect, );
> painter.SetPage();
> //Some painter stuff, draw circles / lines / text...
> painter.FinishPage();
> PdfAnnotation* pAnnotation =
> pPage->CreateAnnotation(EPdfAnnotation::ePdfAnnotation_Stamp, PdfRect(60.0,
> 600.0, 40.0, 40.0));
> pAnnotation->SetFlags(ePdfAnnotationFlags_Print);
> pAnnotation->SetAppearanceStream();
> document.Close();
>
>
> When I open the PDF with Adobe it looks great, the stamp is present with
> all the stuff draw inside but there is also an anchor to rotate the stamp
> and when I Rotate it all my drawing disappears and it transforms into a
> draft stamp.
>
> I tried to add a key Name with a specific name but the rotation transforms
> it in a crossed square not following the appearance stream content.
>
> Any help would be great !
> Thanks in advance,
> Gregory
>
>
>
>
>


Re: [Podofo-users] Problem with pdf file

2021-03-14 Thread Michal Sudolsky
Let's say you have a page "PdfPage *page".

You can get contents:

PdfObject *contents = page->GetObject()->GetIndirectKey("Contents");

Contents can now be a dictionary (a single content stream) or an array of
references to such streams, if I am not wrong, so you need to handle both
situations (and also the situation where there are no contents at all). All
the cases you need to handle are described in the pdf reference, or you can
check in the podofo source code what, for example, "PdfPainter::SetPage"
does. Now let's assume it is an array. You can either prepend "q" at the
beginning of the first contents entry or prepend a new entry to the array
containing only the operator "q" (which is my preferred and simpler way, as
it does not need to recompress anything):

PdfObject *q = page->GetObject()->GetOwner()->CreateObject();
q->GetStream()->Set("q"); // painter.Save() inserts "q"
contents->GetArray().insert(contents->GetArray().begin(), q->Reference());

Now you can draw into the page as usual, but insert "Q" before any drawing
(my preferred way, though, is to append a new contents entry for the new
drawing):

painter.SetPage(page);
painter.Restore(); // this appends "Q"
painter.DrawLine(...

In your case the contents is not an array. Here is a hint for how to turn it
into an array:

PdfArray arr;
// Ensure that contents has a valid reference, i.e. that it is an indirect
// object as it should be: contents->Reference().ObjectNumber() > 0
arr.push_back(contents->Reference());

page->GetObject()->GetDictionary().AddKey("Contents", arr);
// Only if you need a PdfObject to continue; alternatively you can modify
// "arr" directly before adding it as the "Contents" key.
contents = page->GetObject()->GetDictionary().GetKey("Contents");

On Sun, Mar 14, 2021 at 6:26 PM Kegin  wrote:

> Thank you for your reply
>
> But ... sorry, I understand what you mean, but I don’t know much about the
> structure of pdf.
>
> On macOS, building the library seems very hard for me, because I am
> unfamiliar with cmake.
> I use a prebuilt podofo library, so... I can't trace the
> source code.
>
> Is there an easy way to solve it?
>
> For example, using the podofo library to add 'q' and 'Q' with some function?
> I already tried using podofo to resave the pdf, but nothing changed.
>
> In addition, could you please help me modify the part of the pdf you
> mentioned and send me the pdf,
> so I can compare and study it?
>
>
>
>
>
>


Re: [Podofo-users] Problem with pdf file

2021-03-14 Thread Michal Sudolsky
The problem is with the context. Pdf drawing commands are state-based. This is
at the start of the page:

/GSa gs
/CSp cs
/CSp CS
0.75000 0 0 -0.75000 28.500 813.50 cm

The state is not saved prior to this transformation (cm), so the
transformation also applies to your commands. You need to prepend "q" before
the original contents and append the matching "Q" after it, like this:

q
/GSa gs
/CSp cs
/CSp CS
0.75000 0 0 -0.75000 28.500 813.50 cm
...
...
...
Q
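
Purely as an illustration of the wrapping, here is a self-contained sketch that works on the content stream as plain text. `WrapInSaveRestore` is a made-up helper, not a PoDoFo function; with a real file the stream has to be decoded first, or separate "q" and "Q" streams added to a Contents array.

```cpp
#include <string>

// Wraps existing content-stream text in a q ... Q pair so that state changes
// made by the original contents (such as "cm") do not leak into anything
// drawn afterwards.
std::string WrapInSaveRestore(const std::string& content)
{
    std::string wrapped = "q\n";
    wrapped += content;
    // Make sure the closing operator starts on its own line.
    if (!content.empty() && content.back() != '\n')
        wrapped += '\n';
    wrapped += "Q\n";
    return wrapped;
}
```

Drawing commands appended after the returned text see the default coordinate system again.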


On Sun, Mar 14, 2021 at 4:29 PM Kegin  wrote:

> I use wkhtmltopdf to generate a pdf,
> and I want to draw a line on the pdf with the podofo library,
> but this line is not at the exact position.
> I manually converted the html to pdf, using system printing to save the
> page as pdf,
> and drew a line with the podofo library. This line is at the right place.
>
> What is the problem? What is the difference in generating a pdf between
> "wkhtmltopdf" and "system printing"?
> Please help me. Thank you
>
> Command Line
> wkhtmltopdf https://www.google.com google.pdf
>
> My Code
>
> try {
> PdfMemDocument inputDocument;
> PdfMemDocument document;
> PdfPainter painter;
> PdfPage *page;
>
> document.Load("google.pdf");
>
> page = document.GetPage(0);
> painter.SetPage(page);
>
> PdfFont *pFont = document.CreateFont("STHeiti Light", false, new 
> PdfIdentityEncoding(0, 0x, true));
>
> pFont->SetFontSize(12);
> painter.SetFont(pFont);
>
>
> painter.DrawLine(page->GetPageSize().GetWidth() - 1, 
> page->GetPageSize().GetHeight(), page->GetPageSize().GetWidth() - 1, 0 );
> painter.DrawLine(0, 1, page->GetPageSize().GetWidth(), 1 );
>
>
> painter.FinishPage();
> document.Write("output.pdf");
>
> }
> catch (...) {
>
> }
>
>


Re: [Podofo-users] Patch for png transparency and gray scale

2021-02-25 Thread Michal Sudolsky
Also there is this new warning:

podofo/doc/PdfImage.cpp: In function 'void
PoDoFo::LoadFromPngContent(png_structp, png_infop, PoDoFo::PdfImage*)':
podofo/doc/PdfImage.cpp:943:30: warning: variable 'color' might be
clobbered by 'longjmp' or 'vfork' [-Wclobbered]
  943 | png_byte color;
  |  ^

One must be very cautious about designing code around setjmp and longjmp.
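
One common way to keep a local variable well-defined across a longjmp is to declare it volatile. The following is a minimal sketch of that idea only; it is not the PdfImage.cpp code, and all names in it are made up.

```cpp
#include <csetjmp>

namespace {
std::jmp_buf g_jumpBuffer;

// Stands in for a library routine that reports an error by jumping back.
[[noreturn]] void FailWithLongjmp()
{
    std::longjmp(g_jumpBuffer, 1);
}
} // namespace

int ReadColorWithErrorJump()
{
    // A non-volatile local modified between setjmp and longjmp would have an
    // indeterminate value after the jump; volatile keeps the stored value.
    volatile int color = 0;

    if (setjmp(g_jumpBuffer) != 0) {
        // Reached through longjmp: color still holds the value assigned below.
        return color;
    }

    color = 7;
    FailWithLongjmp();
}
```

In C++ the jump must also not skip over objects with non-trivial destructors, which is one more reason for caution around such code.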


On Tue, Jun 16, 2020 at 2:57 AM Michal Sudolsky  wrote:

> This fixes ticket #90. And for those who use these image loading functions
> note that there are still these leaks described in ticket #89 (not
> introduced by this patch, but it seems that this patch introduces some
> leak, smask will not be deallocated in case some method of PdfImage throws).
>
> On Mon, Jun 15, 2020 at 9:17 AM zyx  wrote:
>
>> On Fri, 2020-06-12 at 11:28 +0200, Christian Sakowski wrote:
>> > this patch is currently not part of the trunk but very important! Can
>> > someone check this in, please?
>>
>> Hi,
>> thanks for the reminder. I missed it in the mailing list.
>>
>> That patch, even the file itself, is a nice mix of different coding
>> styles. Not a big deal, just being noticed.
>>
>> It introduced a new compiler warning here:
>>
>> .../src/podofo/doc/PdfImage.cpp: In function ‘void
>> PoDoFo::LoadFromPngContent(png_structp, png_infop, PoDoFo::PdfImage*)’:
>> .../src/podofo/doc/PdfImage.cpp:934:91: warning: suggest parentheses
>> around ‘&&’ within ‘||’ [-Wparentheses]
>>   934 | color_type == PNG_COLOR_TYPE_PALETTE &&
>> png_get_valid(pPng, pInfo, PNG_INFO_tRNS) && png_get_tRNS(pPng, pInfo,
>> , , NULL))
>>   |
>>  
>> ~~^~
>>
>> which I fixed before committing. I also made the LoadFromPngContent() a
>> static function.
>>
>> I cannot really test this, thus I trust you with this.
>>
>> The patch had been committed as r2011:
>> http://sourceforge.net/p/podofo/code/2011
>>
>> Thanks and bye,
>> zyx
>>
>>
>>


Re: [Podofo-users] Helvetica Bold

2021-02-24 Thread Michal Sudolsky
>
>
> > Use PODOFO_HPDF_FONT_HELVETICA_BOLD?
>
> Ok, so there is no automatic selection of this font when using
> CreateFont and bold?
>

There should be, but there is not. I also wondered about that.



> --
>
> Grüße/Regards,
> [heubach-media] | Christian Sakowski
> christian.sakow...@heubach-media.de
> Tel: +49/(0)40/41 455 455
>
>
>
> > On 24.02.2021 at 14:57, Joerg Sonnenberger wrote:
> >
> > On Wed, Feb 24, 2021 at 01:12:19PM +0100, Christian Sakowski wrote:
> >> Hi,
> >>
> >> when calling the code:
> >>
> >> result = doc->CreateFont("Helvetica", true, false);
> >>
> >> i got Helvetica Standard and not bold.
> >>
> >> When i look into result, i see, that the flag_isBase14 is not set. But
> it is a base14-font.
> >> Any suggestions?
> >
> > Use PODOFO_HPDF_FONT_HELVETICA_BOLD?
> >
> > Joerg
>
>
>
>


Re: [Podofo-users] Helvetica Bold

2021-02-24 Thread Michal Sudolsky
Hi,


> Hi,
>
> when calling the code:
>
> result = doc->CreateFont("Helvetica", true, false);
>
> i got Helvetica Standard and not bold.
>
> When i look into result, i see, that the flag_isBase14 is not set. But it
> is a base14-font.
> Any suggestions?
>

You need to use: CreateFont("Helvetica-Bold", true, false);




> --
>
> Grüße/Regards,
> [heubach-media] | Christian Sakowski
> christian.sakow...@heubach-media.de
> Tel: +49/(0)40/41 455 455
>
>
>
> --
> heubach media
> Osterfeldstr. 12-14 | Haus 1 | Eingang Nord
> 22529 Hamburg
> tel: 040 / 52 10 59 - 10 | fax: -99
> mail: i...@heubach-media.de
> home: www.heubach-media.de
> Geschäftsführer|CEO: Matthias Heubach
>
> Mieten Sie Ihre Computer, iPads & Drucker für Ihre Events bei:
> http://www.milo-rental.com


Re: [Podofo-users] Patch for image reading

2021-02-19 Thread Michal Sudolsky
>
>
> I understand what you mean but I still think it is better to have this fix
> than keeping the old piece of code.
>
> I let the administrators decide what to do.
>
>
I am not so sure whether this is a good idea. Actually, PDFs with EI without
preceding whitespace may be more prevalent than PDFs which contain a false
positive "EI " as part of the image data. Imagine PDF generators which
strictly follow the pdf reference and do not put such extra whitespace
before EI.

I think the correct way would be to follow the encoded data (decode until
the length is known and then look for EI).
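
For the simplest case, an inline image without any filter, the length is known from the image dictionary alone. A sketch of that calculation, assuming unfiltered data and rows padded to a byte boundary as the pdf reference describes for sample data; `RawInlineImageLength` is a made-up name, not a PoDoFo function.

```cpp
#include <cstddef>

// Number of bytes of unfiltered inline image data, computed from /W, /H, the
// number of colour components (3 for /RGB) and /BPC. Filtered images need the
// decoder to tell where the data ends instead.
std::size_t RawInlineImageLength(std::size_t width, std::size_t height,
                                 std::size_t componentsPerPixel,
                                 std::size_t bitsPerComponent)
{
    const std::size_t bitsPerRow = width * componentsPerPixel * bitsPerComponent;
    const std::size_t bytesPerRow = (bitsPerRow + 7) / 8; // rows start on byte boundaries
    return bytesPerRow * height;
}
```

A parser could skip that many bytes after ID and only then expect EI, so that an "EI" inside the data cannot end the image early.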


> Christophe
>
>
> On Fri, Feb 19, 2021 at 2:48 PM Michal Sudolsky wrote:
>
>>
>>>
>>> I've tried to write the EI just after the stream and it does not work in
>>> Acrobat. So, I think the whitespace before EI is really mandatory.
>>>
>>> Please find my example as attachment if you want to verify on your own.
>>> Remove the CR before EI at line 40.
>>>
>>>
>> Your image data is longer than it should be. The image ends before "7A 7A
>> 7A 0D 0A 45 49" (zzz\r\nEI), so this pdf is not quite correct. A pdf viewer
>> has to be able to recover from this, but it cannot if the data ends with
>> "zzzEI".
>>
>> Try attached pdf.
>>
>>
>>
>>
>>> Christophe
>>>
>>>
>>> On Fri, Feb 19, 2021 at 12:34 PM Michal Sudolsky wrote:
>>>
>>>>
>>>> I understand you try to find the corner-case where it will fail again.
>>>>>
>>>>> So, let's just consider for the moment that this code is better than
>>>>> the previous one.
>>>>>
>>>>>
>>>> I think it would be better but it may not parse valid pdf because it
>>>> seems that whitespace before EI is not required.
>>>>
>>>>
>>>>
>>>>> Christophe
>>>>>
>>>>>
>>>>> On Fri, Feb 19, 2021 at 12:13 PM Michal Sudolsky wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>>> I agree with you but this is a rare case as the PDF generators never
>>>>>>> generate such a sequence.
>>>>>>>
>>>>>>
>>>>>> It is just that the probability for something like " EI " is small
>>>>>> (smaller than "EI" at the end) but it can happen. I doubt that generators
>>>>>> are actively trying to avoid that.
>>>>>>
>>>>>> Also I see nothing in the pdf reference about required whitespace
>>>>>> before EI.
>>>>>>
>>>>>>
>>>>>>> Christophe
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 19, 2021 at 10:50 AM Michal Sudolsky wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This should be better than it was. But I have another example:
>>>>>>>>
>>>>>>>> BI /W 4 /H 4 /CS /RGB /BPC 8
>>>>>>>> ID
>>>>>>>> 0z0z00zzz00z0zzz0zz EI aazazaazzzaazazzzazzz
>>>>>>>> EI
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Feb 19, 2021 at 10:28 AM Christophe 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> Please find as attachment a patch to better read inline images.
>>>>>>>>>
>>>>>>>>> The image was read until 'EI [...]
>>>>>>>>> image stream. It is the case in one of my PDF files:
>>>>>>>>> ===
>>>>>>>>> ...
>>>>>>>>> ID
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> qZ$Tls8Vrqs8)cqqZ$Tls8Vops7u]pq>^Kjs8Vops7u]pq>^Kjs8Vlns7lTnq#:
>>>>>>>>> ...
>>>>>>>>> EI
>>>>>>>>> ...
>>>>>>>>> ===
>>>>>>>>>
>>>>>>>>> So, the patch looks for "EI".
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christophe


Re: [Podofo-users] Patch for image reading

2021-02-19 Thread Michal Sudolsky
>
>
>
> I've tried to write the EI just after the stream and it does not work in
> Acrobat. So, I think the whitespace before EI is really mandatory.
>
> Please find my example as attachment if you want to verify on your own.
> Remove the CR before EI at line 40.
>
>
Your image data is longer than it should be. The image ends before "7A 7A 7A
0D 0A 45 49" (zzz\r\nEI), so this pdf is not quite correct. A pdf viewer has
to be able to recover from this, but it cannot if the data ends with "zzzEI".

Try attached pdf.




> Christophe
>
>
> On Fri, Feb 19, 2021 at 12:34 PM Michal Sudolsky wrote:
>
>>
>> I understand you try to find the corner-case where it will fail again.
>>>
>>> So, let's just consider for the moment that this code is better than the
>>> previous one.
>>>
>>>
>> I think it would be better but it may not parse valid pdf because it
>> seems that whitespace before EI is not required.
>>
>>
>>
>>> Christophe
>>>
>>>
>>> On Fri, Feb 19, 2021 at 12:13 PM Michal Sudolsky wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>>> I agree with you but this is a rare case as the PDF generators never
>>>>> generate such a sequence.
>>>>>
>>>>
>>>> It is just that the probability for something like " EI " is small
>>>> (smaller than "EI" at the end) but it can happen. I doubt that generators
>>>> are actively trying to avoid that.
>>>>
>>>> Also I see nothing in the pdf reference about required whitespace
>>>> before EI.
>>>>
>>>>
>>>>> Christophe
>>>>>
>>>>>
>>>>> On Fri, Feb 19, 2021 at 10:50 AM Michal Sudolsky wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This should be better than it was. But I have another example:
>>>>>>
>>>>>> BI /W 4 /H 4 /CS /RGB /BPC 8
>>>>>> ID
>>>>>> 0z0z00zzz00z0zzz0zz EI aazazaazzzaazazzzazzz
>>>>>> EI
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 19, 2021 at 10:28 AM Christophe  wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Please find as attachment a patch to better read inline images.
>>>>>>>
>>>>>>> The image was read until 'EI [...]
>>>>>>> image stream. It is the case in one of my PDF files:
>>>>>>> ===
>>>>>>> ...
>>>>>>> ID
>>>>>>> ...
>>>>>>>
>>>>>>> qZ$Tls8Vrqs8)cqqZ$Tls8Vops7u]pq>^Kjs8Vops7u]pq>^Kjs8Vlns7lTnq#:
>>>>>>> ...
>>>>>>> EI
>>>>>>> ...
>>>>>>> ===
>>>>>>>
>>>>>>> So, the patch looks for "EI".
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christophe


image.pdf
Description: Adobe PDF document


Re: [Podofo-users] Patch for image reading

2021-02-19 Thread Michal Sudolsky
> I understand you try to find the corner-case where it will fail again.
>
> So, let's just consider for the moment that this code is better than the
> previous one.
>
>
I think it would be better but it may not parse valid pdf because it
seems that whitespace before EI is not required.



> Christophe
>
>
> On Fri, Feb 19, 2021 at 12:13 PM Michal Sudolsky wrote:
>
>> Hi,
>>
>>
>>> I agree with you but this is a rare case as the PDF generators never
>>> generate such a sequence.
>>>
>>
>> It is just that the probability for something like " EI " is small
>> (smaller than "EI" at the end) but it can happen. I doubt that generators
>> are actively trying to avoid that.
>>
>> Also I see nothing in the pdf reference about required whitespace before
>> EI.
>>
>>
>>> Christophe
>>>
>>>
>>> On Fri, Feb 19, 2021 at 10:50 AM Michal Sudolsky wrote:
>>>
>>>> Hi,
>>>>
>>>> This should be better than it was. But I have another example:
>>>>
>>>> BI /W 4 /H 4 /CS /RGB /BPC 8
>>>> ID
>>>> 0z0z00zzz00z0zzz0zz EI aazazaazzzaazazzzazzz
>>>> EI
>>>>
>>>>
>>>> On Fri, Feb 19, 2021 at 10:28 AM Christophe  wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Please find as attachment a patch to better read inline images.
>>>>>
>>>>> The image was read until 'EI [...]
>>>>> image stream. It is the case in one of my PDF files:
>>>>> ===
>>>>> ...
>>>>> ID
>>>>> ...
>>>>>
>>>>> qZ$Tls8Vrqs8)cqqZ$Tls8Vops7u]pq>^Kjs8Vops7u]pq>^Kjs8Vlns7lTnq#:
>>>>> ...
>>>>> EI
>>>>> ...
>>>>> ===
>>>>>
>>>>> So, the patch looks for "EI".
>>>>>
>>>>> Regards,
>>>>> Christophe


Re: [Podofo-users] Patch for image reading

2021-02-19 Thread Michal Sudolsky
Hi,


> I agree with you but this is a rare case as the PDF generators never
> generate such a sequence.
>

It is just that the probability for something like " EI " is small (smaller
than "EI" at the end) but it can happen. I doubt that generators are
actively trying to avoid that.

Also I see nothing in the pdf reference about required whitespace before EI.


> Christophe
>
>
> On Fri, Feb 19, 2021 at 10:50 AM Michal Sudolsky wrote:
>
>> Hi,
>>
>> This should be better than it was. But I have another example:
>>
>> BI /W 4 /H 4 /CS /RGB /BPC 8
>> ID
>> 0z0z00zzz00z0zzz0zz EI aazazaazzzaazazzzazzz
>> EI
>>
>>
>> On Fri, Feb 19, 2021 at 10:28 AM Christophe  wrote:
>>
>>> Hello all,
>>>
>>> Please find as attachment a patch to better read inline images.
>>>
>>> The image was read until 'EI [...]
>>> stream. It is the case in one of my PDF files:
>>> ===
>>> ...
>>> ID
>>> ...
>>>
>>> qZ$Tls8Vrqs8)cqqZ$Tls8Vops7u]pq>^Kjs8Vops7u]pq>^Kjs8Vlns7lTnq#:
>>> ...
>>> EI
>>> ...
>>> ===
>>>
>>> So, the patch looks for "EI".
>>>
>>> Regards,
>>> Christophe


Re: [Podofo-users] Patch for image reading

2021-02-19 Thread Michal Sudolsky
Hi,

This should be better than it was. But I have another example:

BI /W 4 /H 4 /CS /RGB /BPC 8
ID
0z0z00zzz00z0zzz0zz EI aazazaazzzaazazzzazzz
EI
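
To make the false positive concrete, here is a small sketch of a scanner that accepts the first "EI" with whitespace on both sides. That this is how the patch searches is an assumption made for the example, and `FindFirstDelimitedEI` is a made-up name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the offset of the first "EI" that has a whitespace character on
// both sides, or std::string::npos if there is none.
std::size_t FindFirstDelimitedEI(const std::string& data)
{
    for (std::size_t i = 1; i + 2 < data.size(); ++i) {
        if (data[i] == 'E' && data[i + 1] == 'I'
            && std::isspace(static_cast<unsigned char>(data[i - 1]))
            && std::isspace(static_cast<unsigned char>(data[i + 2])))
            return i;
    }
    return std::string::npos;
}
```

Fed with the data lines of the example above, such a scanner stops at the " EI " in the middle of the samples and never reaches the real terminator on the last line.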


On Fri, Feb 19, 2021 at 10:28 AM Christophe  wrote:

> Hello all,
>
> Please find as attachment a patch to better read inline images.
>
> The image was read until 'EI [...] stream. It is the case in one of my PDF files:
> ===
> ...
> ID
> ...
>
> qZ$Tls8Vrqs8)cqqZ$Tls8Vops7u]pq>^Kjs8Vops7u]pq>^Kjs8Vlns7lTnq#:
> ...
> EI
> ...
> ===
>
> So, the patch looks for "EI".
>
> Regards,
> Christophe


Re: [Podofo-users] Problem with scanned document

2021-01-22 Thread Michal Sudolsky
Hi,

That pdf is not valid according to the pdf reference, although it would be
good to fix podofo so that it can parse it, as other pdf tools do not have
problems. Entries in the xref table use a single character '\n' as the end
of line. It always has to be 2 characters, as is mentioned in `CheckEOL`,
which fails, and podofo is then unable to find a catalog.
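
As a sketch of the rule involved: the pdf reference describes each cross-reference entry as exactly 20 bytes, a 10-digit offset, a space, a 5-digit generation number, a space, 'n' or 'f', and a 2-character end of line (space + CR, space + LF, or CR + LF). The function below is a made-up strict check, not the podofo `CheckEOL` code.

```cpp
#include <cctype>
#include <string>

// Strictly validates one cross-reference table entry of the form
// "nnnnnnnnnn ggggg n" followed by a two-character end-of-line sequence.
bool IsValidXRefEntry(const std::string& entry)
{
    if (entry.size() != 20)
        return false;
    for (int i = 0; i < 10; ++i)          // 10-digit byte offset
        if (!std::isdigit(static_cast<unsigned char>(entry[i])))
            return false;
    if (entry[10] != ' ')
        return false;
    for (int i = 11; i < 16; ++i)         // 5-digit generation number
        if (!std::isdigit(static_cast<unsigned char>(entry[i])))
            return false;
    if (entry[16] != ' ')
        return false;
    if (entry[17] != 'n' && entry[17] != 'f')
        return false;
    const std::string eol = entry.substr(18, 2);
    return eol == " \r" || eol == " \n" || eol == "\r\n";
}
```

An entry that ends in a lone '\n' is only 19 bytes long and fails this check; a tolerant parser would have to accept it anyway to read files like this one.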

On Fri, Jan 22, 2021 at 6:05 PM  wrote:

> Hi,
>
> What is wrong with this scanned document? When it is opened with
> PdfMemDocument (for me in podofoinpose) following error is thrown:
> Unable to create Document: A object was expected but not found.
>
> I first thought the PDF is invalid, but I tested with the validator and
> it seems fine (https://www.pdf-online.com/osa/validate.aspx). Also,
> looking at the internals of the PDF that "root" object seems to be there.
>
> Any ideas why that happens?
>
> Thanks,
> Chris


Re: [Podofo-users] IdentityEncoding issue with Cyrillic alphabet

2020-11-06 Thread Michal Sudolsky
On Fri, Oct 30, 2020 at 8:50 AM Dmytro Tristan 
wrote:

>
>
> Dear Peter and Michal,
>
>
You probably mistakenly sent this only to me so I am returning it to the
mailing list.


> What I did so far:
> I have downloaded and built the latest version of podofo from trunk (as
> far as I'm undestand trunk is the most updated branch);
>

Is it the latest revision r2016 from 2020-10-10? I suggest double-checking
whether you are really linking with the version of podofo which you
compiled instead of some old version from your system.

> Tried drawing the text with other fonts;
> Result is the same.
>
> Maybe the issue is in my code. Could you please look on some parts of it?
>
> Here is my code for font creation:
>
> m_specialFont = m_doc.CreateFont("Arial Nova Cond-Bold", false, false, false,
>     PdfEncodingFactory::GlobalWinAnsiEncodingInstance(),
>     PdfFontCache::eFontCreationFlags_None, true,
>     specialFontPath.string().c_str());
>
> generalFontPath.append(Config::instance().specialTextFontFile());
> m_generalFont = m_doc.CreateFont("Arial Nova Cond", false, false, false,
>     PdfEncodingFactory::GlobalIdentityEncodingInstance(),
>     PdfFontCache::eFontCreationFlags_None, true,
>     generalFontPath.string().c_str());
>
> And I draw text with
>
> painter.DrawMultiLineText(textPosX, textPosY, textWidth, textHeight,
>     std::get<1>(m_text).c_str(), // std::wstring here, but I have tried with PdfString also
>     static_cast(horAlignment), static_cast(verAlignment));
>
>
>
I never tried these wstring APIs so I cannot confirm whether they are
working. Did you try to simply use a normal char string?


> Thank you for your help and time.
>
> Mitia Tristan
>
>
>
> --
> *From:* Peter Bozek 
> *Sent:* Thursday, October 29, 2020 12:31
> *To:* Dmytro Tristan 
> *Cc:* Michal Sudolsky ; podofo-us...@lists.sf.net <
> podofo-us...@lists.sf.net>
> *Subject:* Re: [Podofo-users] IdentityEncoding issue with Cyrillic
> alphabet
>
> On Thu, Oct 29, 2020 at 10:13 AM Dmytro Tristan 
> wrote:
> >
> > I’m not sure that I have explained correctly. I use version 0.9.6.
> Previous ones I have mentioned as examples of issues I have found. Do I
> understand correctly that I should try CreateFontSubset?
>
> This was a problem of an older version - 0.9.4 for sure. When you used
> identity encoding, which you had to when various scripts (Cyrillic /
> Latin) were used, the included font did not contain a width table,
> resulting in garbage.
>
> There is a discussion re identity encoding on the list, the problem
> was fixed a few years ago, so try to get the latest version of PoDoFo.
>
> Regards,
>
> Peter Bozek
>
> >
> > Thank you.
> >
> >
> > Best,
> >
> > Mitia
> >
> > Sent from my IPhone
> > From: Michal Sudolsky
> > Sent: Wednesday, October 28, 2020 10:55:22 PM
> > To: Dmytro Tristan
> > Cc: podofo-us...@lists.sf.net
> > Subject: Re: [Podofo-users] IdentityEncoding issue with Cyrillic alphabet
> >
> >
> >
> > On Wed, Oct 28, 2020 at 7:15 AM Dmytro Tristan 
> wrote:
> >
> > Hi Guys,
> >
> > I'm facing an issue with drawing Cyrillic text in a pdf document, which I
> create using the PdfStreamedDocument class.
> >
> > I searched the mailing list archive and found a couple of similar threads.
> > First (https://sourceforge.net/p/podofo/mailman/message/36872812/)
> proposes to build and install the version from trunk, which I tried, but I
> think the solution is outdated as trunk was updated since then.
> >
> >
> > No, it is not outdated. For a truetype font subset you need to use
> CreateFontSubset with some newer podofo in which it works. I
> personally doubt that anyone will try to fix old versions of podofo.
> Looking at your pdf the problem is probably due to missing some pdf font
> structures compared to a pdf generated by the latest p

Re: [Podofo-users] IdentityEncoding issue with Cyrillic alphabet

2020-10-28 Thread Michal Sudolsky
On Wed, Oct 28, 2020 at 11:00 PM Dmytro Tristan 
wrote:

> I’m not sure that I have explained correctly. I use version 0.9.6.
> Previous ones I have mentioned as examples of issues I have found. Do I
> understand correctly that I should try CreateFontSubset?
>
>
I consider 0.9.6 an old version. I noticed that the fonts in your pdf are
both truetype font subsets. I am not sure how you are creating the fonts, but
it seems that CreateFont cannot be used to do this in the version of podofo
which I am using. Although there is the option
eFontCreationFlags_Type1Subsetting, I suppose it is intended to work only
for type1 fonts. So if you have CreateFontSubset in your version you can try
it (or a newer podofo).



> Thank you.
>
>
> Best,
>
> Mitia
>
> Sent from my IPhone
> ------
> *From:* Michal Sudolsky
> *Sent:* Wednesday, October 28, 2020 10:55:22 PM
> *To:* Dmytro Tristan
> *Cc:* podofo-us...@lists.sf.net
> *Subject:* Re: [Podofo-users] IdentityEncoding issue with Cyrillic alphabet
>
>
>
> On Wed, Oct 28, 2020 at 7:15 AM Dmytro Tristan 
> wrote:
>
> Hi Guys,
>
> I'm facing an issue with drawing Cyrillic text in a pdf document, which I
> create using the PdfStreamedDocument class.
>
> I searched the mailing list archive and found a couple of similar threads.
> First (https://sourceforge.net/p/podofo/mailman/message/36872812/)
> proposes to build and install the version from trunk, which I tried, but I
> think the solution is outdated as trunk was updated since then.
>
>
> No, it is not outdated. For a truetype font subset you need to use
> CreateFontSubset with some newer podofo in which it works. I
> personally doubt that anyone will try to fix old versions of
> podofo. Looking at your pdf the problem is probably due to missing some pdf
> font structures compared to a pdf generated by the latest podofo, so the
> font is probably treated as if it had a single-byte encoding, and the effect
> of this is the extra spaces you mentioned.
>
> Second (https://sourceforge.net/p/podofo/mailman/message/36238386/)
> basically shows that the issue starts from version 0.9.4. I tested
> with 0.9.3 and can confirm that the issue is not reproduced there.
>
> The problem is that when I create a font with Identity encoding the text
> prints with extra space between characters, the same as in the mentioned
> threads. Like [l i k e] (no matter whether Cyrillic or Latin characters). I
> enclose an example PDF. In this PDF I use two fonts - Arial Nova Cond and
> Arial Nova Cond-Bold. The first is created with Identity encoding, the
> second with WinAnsiEncoding.
>
>
> I have tried different fonts with different encodings; the Cyrillic
> characters are either not printed at all, or all text is printed with
> spaces.
>
> The obvious way is to check what was changed when upgrading to 0.9.4, and
> I did that, but there are a lot of changes and I do not even know where
> to start digging.
>
> Will appreciated any help with this issue.
>
> Thank you.
>
> Mitia Tristan
>
>


Re: [Podofo-users] IdentityEncoding issue with Cyrillic alphabet

2020-10-28 Thread Michal Sudolsky
On Wed, Oct 28, 2020 at 7:15 AM Dmytro Tristan 
wrote:

> Hi Guys,
>
> I'm waiving an issue with drawing Cyrillic text in pdf document, which i
> create using PdfStreamedDocument class.
>
> I searched via mailing list archive and had found couple similar threads.
> First (https://sourceforge.net/p/podofo/mailman/message/36872812/)
> propose to build and install version from trunk, what I had tried, but i
> think solution is outdated as trunk was updated since then.
>

No, it is not outdated. For a TrueType font subset you need to use
CreateFontSubset with a newer podofo in which it works. I
personally doubt that anyone will try to fix old versions of
podofo. Looking at your pdf, the problem is probably that some pdf
font structures are missing compared to a pdf generated by the latest podofo,
so the font is probably treated as if it had a single-byte encoding, and the
effect of this is the extra spaces you mention.

> Second (https://sourceforge.net/p/podofo/mailman/message/36238386/)
> basically shows that issue starts from 0.9.4 version. I had tested that
> with 0.9.3 and can confirm that in last one issue is not reproduced.
>
> The problem is that when I create font with Identity encoding the text
> prints with extra space between characters, the same as in mentioned
> threads. Like [l i k e] (no matter Cyrillic or Latin characters). I enclose
> example of PDF. In this PDf I use two fonts - Arial Nova Cond and Arial
> Nova Cond-Bold. First created with Identity encoding, the second with
> WinAnsiEncoding.
>
>
> I have tried different fonts with different encoding, the Cyrillic
> character either is not printed at all, either all text is printed with
> spaces;
>
> The obvious way is to check what was changed when upgrading to 0.9.4 and
> had did that, but there are a lot of changes and I do not know even where
> to start digging.
>
> Will appreciated any help with this issue.
>
> Thank you.
>
> Mitia Tristan
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] CreateFont API returns NULL for few fonts

2020-08-19 Thread Michal Sudolsky
On Mon, Aug 17, 2020 at 9:20 PM Anand Ramasamy  wrote:

> Dear All,
>
>
>
> Thank you for help in advance.
>
>
>
> We are using "CreateFont" API to create PdfFont object(as given in below
> code) and use that font object to set the Font.
>
>
>
> ...
>
> PdfFont *font = NULL;
>
> font = doc->CreateFont("Arial Unicode MS", bold, italic, false, encoding);
>
> ...
>
>
>
> For "Arial Unicode MS" font, "CreateFont" API returns NULL in Windows 10
> 32-bit environment.
>
> Same API works fine in Windows 10 64-bit environment.
>
> Please note that same behavior is observed with “Arabic TypeSetting” font
> also.
>
>
>
> We have also confirmed that "Arial Unicode MS" font is installed in the PC.
>
> Please let us know whether this is a known issue in Podofo library or
> should we use any other API to create PDFFont object.
>
> PODOFO Library Version used is: 0.9.5.3
>
>
I would recommend to use the latest version directly from SVN (not 0.9.6).


>
> Also is there any other method which we can use to check whether a Font
> supports a particular language or not.
>
> For example, Japanese, Chinese characters are not supported by "Arial"
> font.
>
> Is there any API or method to confirm whether a Font supports the given
> string.
>
>
From a PdfFont you can obtain its PdfFontMetrics. If the font was loaded using
the Freetype api, the metrics can be cast to PdfFontMetricsFreetype, from which
you can obtain the FT_Face and use the Freetype api to determine what encodings
and languages that font supports. Alternatively, it should be possible to find
the right font name using the Fontconfig api.
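
A minimal sketch of such a check, assuming the text is already decoded to
UTF-32. The names font_covers_text and glyphIndexOf are made up for this
example and are not podofo api. With Freetype, the lookup could wrap
FT_Get_Char_Index(face, codePoint), which returns 0 when the face has no glyph
for that character code in the currently selected charmap.

```cpp
#include <string>

// Returns true if every code point in `text` maps to a non-zero glyph index.
// `glyphIndexOf` is a caller-supplied lookup (hypothetical name); with Freetype
// it could wrap FT_Get_Char_Index(face, codePoint).
template <typename GlyphLookup>
bool font_covers_text(const std::u32string& text, GlyphLookup glyphIndexOf)
{
    for (char32_t codePoint : text)
    {
        if (glyphIndexOf(codePoint) == 0)
            return false; // missing glyph: the font cannot render this character
    }
    return true;
}
```

Keeping the lookup as a parameter means the check itself does not depend on
how the FT_Face was obtained.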



>
>
> Once again thanks in advance for the help.
>
>
>
> Regards,
>
> Anand.
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


[Podofo-users] Base14 fonts ascent and descent values

2020-08-18 Thread Michal Sudolsky
Hi,

Why were the ascent and descent values of all base14 fonts changed in the
commit from 16.09.2018, "Enhance base14 fonts' underline and strikeout position
and thickness ...thus they better match their counterpart's appearance."? Maybe
it was unintended, as it is not mentioned in the commit message?

Now these descent values exceed the bounding box of these fonts, which
should not happen. And the original values were what is officially provided by
Adobe:

https://www.adobe.com/devnet/font.html

"Font Metrics for PDF Core 14 Fonts"
http://download.macromedia.com/pub/developer/opentype/tech-notes/Core14_AFMs.zip

For example Helvetica.afm:

Ascender 718
Descender -207

I suggest that they should be changed back.

Btw there is also kerning info about these fonts.
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] podofo memleak broken pdf

2020-08-17 Thread Michal Sudolsky
> b) the middle chunk doesn't look related;
>

If you meant this then ".get()" is needed to obtain a pointer to the
managed object:

-if( (pBuffer - pStart) >= lBufferLen )
+if( (pBuffer - pStart.get()) >= lBufferLen )
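
As a small illustration of why ".get()" is needed, here is a sketch that
assumes pStart is a std::unique_ptr<char[]> (the actual type used in the patch
is not shown in this thread) and that bytes_consumed is a made-up helper name:

```cpp
#include <cstddef>
#include <memory>

// Number of bytes between the start of an owned buffer and a raw cursor into it.
// `pStart` owns the allocation; `pBuffer` points somewhere inside it.
std::ptrdiff_t bytes_consumed(const std::unique_ptr<char[]>& pStart, const char* pBuffer)
{
    // "pBuffer - pStart" would not compile: the smart pointer itself does not
    // take part in pointer arithmetic, so the raw pointer is fetched first.
    return pBuffer - pStart.get();
}
```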

> d) what is your real name, please? It'll be used in credits for
>    the change.
>
>
Sorry I cannot resist but I am curious what is your name zyx?


>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Any plans to relocate to github?

2020-08-17 Thread Michal Sudolsky
>
> The integration of
> svn and tickets is also a plus
>
>
On both github and gitlab there is even better integration between tickets
(called issues) and git. For example accepting and merging a pull request
can close the respective ticket automatically.


>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Patch for new tool to remove watermarks

2020-08-13 Thread Michal Sudolsky
Hi,


> Hi,
> I'm not sure I'd "approve" such tool. The watermarks have their
> intention/usage, removing them might "break" the intention. I'm not the
> decision maker here, I'm just saying my personal opinion.
>

I do not intend to express any opinion here. I am just curious: who is
now the decision maker for podofo?

> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Any plans to relocate to github?

2020-08-12 Thread Michal Sudolsky
Hi,

I totally agree with that.

Actually I have a local git fork of podofo made with the git-svn tool. It is
easier for me to do both simple and complex changes there using branches,
staging and other git features. Then I can export these changes as git
patches. To simplify things on the upstream side I reapply these
patches on the svn repository and export them as svn patches. When they are
applied upstream I do a git rebase to synchronise. This works
seamlessly in most cases, except in a few where the same change
upstream was a little different from my local change. I think moving to a git
repo would greatly simplify things not only for me but also for the maintainer
and (potential) contributors. With github (or another git-based system) a
contributor can create pull requests directly from changes made on his/her
local copy with a single click. The maintainer can then review the changes and
incorporate them with a single click instead of handling patch files.
It is just about simplifying things. I doubt that there is something
similar and as easy for svn. And branches in git are much more
lightweight: they are more or less just pointers to specific commits
instead of a full copy of the files as on svn, where the process of merging is
probably not so easy (I guess this is why we are sending patch files
instead of creating branches on the svn repo?).

On Wed, Aug 12, 2020 at 2:25 PM Thomas Barnekov <
thomas.barne...@formpipe.com> wrote:

> Hi,
> In my opinion, a move to GitHub or any other Git-based system could
> benefit the PoDoFo project in terms of contributions.
>
> Subversion makes it very hard to maintain a fork with your own patches,
> and the current process for getting a patch into PoDoFo seems to rely a lot
> on manual labor. I'm afraid the current process (and Subversion) poses a
> barrier for some people when it comes to contributing to PoDoFo.
> On top of that, it seems that patches sent to the mailing list are
> sometimes overlooked or forgotten, which is not exactly a motivating factor
> for contributors.
>
> The strength of GitHub is that it's easy to create your own fork where you
> can commit your changes. Once you're ready, you send a pull request to the
> main project to get your changes incorporated/accepted.
> Until your PR is accepted, it's easy to keep your fork up-to-date with the
> main project, so you don't have to wait for your patch to be accepted
> before you can use it yourself.
>
> For the maintainer, reviewing and accepting a contribution (in the shape
> of a pull request) is done with a few clicks in a browser. This could help
> speed up the process of getting submitted patches into the main PoDoFo repo.
>
> GitHub is pretty much the standard for hosting open-source projects, so
> basically every future potential contributor will already be familiar with
> the process of forking and sending pull requests, but there are plenty of
> alternatives that provide pretty much the same advantages.
> It's basically all about making it as easy as possible for people to
> contribute.
>
> Personally I've only contributed a single patch to PoDoFo, because the bug
> was a show stopper for me. I don't bother doing minor contributions because
> the it's basically too troublesome to contribute, compared to the work that
> would actually go into the code itself.
>
> But that's just my point of view.
>
> Best Regards,
> Thomas Barnekov
>
> -Original Message-
> From: zyx 
> Sent: 12. august 2020 09:32
> To: podofo-users@lists.sourceforge.net
> Subject: Re: [Podofo-users] Any plans to relocate to github?
>
> On Wed, 2020-08-12 at 10:01 +0500, Ivan Romanov via Podofo-users wrote:
> > My question in title.
>
> Hi,
> I hope not. I do not think the move itself would make the project more
> attractive to the contributors. Though that would be nice to see (more
> contributors, not the move to GitHub).
>
> Just my personal opinion, I'm not the decision maker here.
> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] bug in podofo standard encoding vector

2020-06-16 Thread Michal Sudolsky
>
>
> you could start with counting the number of entries in the s_cEncoding
> array.
>
> It sums up to 255 instead of 256. The indices are in the comments in hex.
> 0xCC is missing!
>

Yes, I noticed that there is one entry missing. It has been like this since
2010, when these encodings were added.
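
A missing entry like this could be caught at compile time. The sketch below is
only an illustration: array_length and kDemoTable are made-up names, and the
two-entry table stands in for a real 256-entry one. The important detail is
that the array is declared without an explicit size, so the compiler counts
the initializers instead of silently zero-filling a short list.

```cpp
#include <cstddef>
#include <cstdint>

// Number of elements in a C array, deduced by the compiler.
template <typename T, std::size_t N>
constexpr std::size_t array_length(const T (&)[N])
{
    return N;
}

// Hypothetical miniature table, declared without an explicit size.
// A real single-byte encoding table would list 256 code points.
static const std::uint16_t kDemoTable[] = {
    0x00B8, // CB # CEDILLA
    0x02DD, // CD # DOUBLE ACUTE ACCENT (no CC entry in between)
};

// For a real table the expected value would be 256; the demo has two entries.
static_assert(array_length(kDemoTable) == 2, "unexpected number of table entries");
```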


> Otherwise you could save a pdf with a subsetted type1 font and use the
> 'í'. Then look with a debugger at line 626 of PdfFontType1.cpp to see it
> can not find the dotlessi or any other character defined after index 0xCC.
>
What do you mean by "subsetted"? Is this something with the function
"CreateFontSubset"?

Here is test case:

PdfMemDocument doc;
PdfPainter p;
p.SetPage(doc.CreatePage({0, 0, 500, 800}));
p.SetFont(doc.CreateFont("Helvetica", false, false, false,
PdfEncodingFactory::GlobalStandardEncodingInstance()));
p.DrawText(0, 700, (pdf_utf8*)"a\u0131o");
p.FinishPage();
doc.Write("test.pdf");

After the patch, "dotlessi" can be seen in the pdf (<61F56F> Tj), but not
without it. This seems to be a problem only when using "StandardEncoding".

Btw character which you sent does not look like "dotlessi" on my side
(attachment some_i.png):
[image: some_i.png]

This is how it looks in the PDF after the standard encoding was patched
(dotlessi.png):
[image: dotlessi.png]

Svn patch attached.


> On 16-6-2020 03:24, Michal Sudolsky wrote:
>
> Hi,
>
> Can you please let me know how exactly I can test this?
>
> On Mon, Jun 1, 2020 at 3:28 PM Ferdinand Oeinck  wrote:
>
>> Hi,
>>
>> I'm using the podofo source code since some years.
>>
>> In my own copy I fixed this bug in PdfEncoding.cpp in 2013.
>>
>> Recently I've update my source code to 0.9.6 and found the bug is still
>> present on sourceforge.
>>
>> Maybe you could fix it in the main repository? Thanks in advance!
>>
>> Ferdinand Oeinck,
>> Big Roses Software.
>>
>> Please look at lines 1605 and 1606 of PdfEncoding.cpp:
>>
>> 0x00B8, // CB # CEDILLA # cedilla
>> 0x02DD, // CD # DOUBLE ACUTE ACCENT # hungarumlaut
>>
>> I think there is one line missing:
>> 0x00B8, // CB # CEDILLA # cedilla
>> /*--> missing*/ 0x, // CC undefined
>> 0x02DD, // CD # DOUBLE ACUTE ACCENT # hungarumlaut
>>
>> When I include this line, subsetted type1 fonts using 'í' will find
>> /dotlessi otherwise they would find /.notdef
>>
>>
>>
>> ___
>> Podofo-users mailing list
>> Podofo-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>
>


standard_encoding_missing_CC.patch
Description: Binary data
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] bug in podofo standard encoding vector

2020-06-15 Thread Michal Sudolsky
Hi,

Can you please let me know how exactly I can test this?

On Mon, Jun 1, 2020 at 3:28 PM Ferdinand Oeinck  wrote:

> Hi,
>
> I'm using the podofo source code since some years.
>
> In my own copy I fixed this bug in PdfEncoding.cpp in 2013.
>
> Recently I've update my source code to 0.9.6 and found the bug is still
> present on sourceforge.
>
> Maybe you could fix it in the main repository? Thanks in advance!
>
> Ferdinand Oeinck,
> Big Roses Software.
>
> Please look at lines 1605 and 1606 of PdfEncoding.cpp:
>
> 0x00B8, // CB # CEDILLA # cedilla
> 0x02DD, // CD # DOUBLE ACUTE ACCENT # hungarumlaut
>
> I think there is one line missing:
> 0x00B8, // CB # CEDILLA # cedilla
> /*--> missing*/ 0x, // CC undefined
> 0x02DD, // CD # DOUBLE ACUTE ACCENT # hungarumlaut
>
> When I include this line, subsetted type1 fonts using 'í' will find
> /dotlessi otherwise they would find /.notdef
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Patch for png transparency and gray scale

2020-06-15 Thread Michal Sudolsky
This fixes ticket #90. For those who use these image loading functions,
note that the leaks described in ticket #89 are still there (they were not
introduced by this patch, but it seems that this patch introduces another
leak: smask will not be deallocated if some method of PdfImage throws).

On Mon, Jun 15, 2020 at 9:17 AM zyx  wrote:

> On Fri, 2020-06-12 at 11:28 +0200, Christian Sakowski wrote:
> > this patch is currently not part of the trunk but very important! Can
> > someone check this in, please?
>
> Hi,
> thanks for the reminder. I missed it in the mailing list.
>
> That patch, even the file itself, is a nice mix of different coding
> styles. Not a big deal, just being noticed.
>
> It introduced a new compiler warning here:
>
> .../src/podofo/doc/PdfImage.cpp: In function ‘void
> PoDoFo::LoadFromPngContent(png_structp, png_infop, PoDoFo::PdfImage*)’:
> .../src/podofo/doc/PdfImage.cpp:934:91: warning: suggest parentheses
> around ‘&&’ within ‘||’ [-Wparentheses]
>   934 | color_type == PNG_COLOR_TYPE_PALETTE &&
> png_get_valid(pPng, pInfo, PNG_INFO_tRNS) && png_get_tRNS(pPng, pInfo,
> , , NULL))
>   |
>  
> ~~^~
>
> which I fixed before committing. I also made the LoadFromPngContent() a
> static function.
>
> I cannot really test this, thus I trust you with this.
>
> The patch had been committed as r2011:
> http://sourceforge.net/p/podofo/code/2011
>
> Thanks and bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Linking podofo with xcode project

2020-05-26 Thread Michal Sudolsky
Hi,

> Hi,
> being it it, I'd expect a lot more format-specific warnings, not only
> this one. It's only my opinion, no proof available. I just tried with
> clang 10.0.0 on Linux and it doesn't claim this warning. It claims some
> other, which gcc (10.1.1) does not claim about here.
>
>
On which version of linux did you try?

What is interesting if I force it to use these defines:

#  define PDF_FORMAT_INT64 "lld"
#  define PDF_FORMAT_UINT64 "llu"
#  define PDF_SIZE_FORMAT "zu"

Then there is no -Wformat warning on osx. But there are -Wformat warnings
on linux (one with llOffset and one with llGeneration):

podofo/base/PdfParser.cpp:908:54: warning: format ‘%lld’ expects argument
of type ‘long long int*’, but argument 3 has type ‘PoDoFo::pdf_int64*’ {aka
‘long int*’} [-Wformat=]
  908 | int read = sscanf( m_buffer.GetBuffer(), "%10"
PDF_FORMAT_INT64 " %5" PDF_FORMAT_INT64 " %c%c%c",
  909 |   , , ,
,  );

Are there any problems using macro like "SCNd64" from cinttypes?
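
To show what I mean, here is a sketch using the standard SCNd64 macro. The
function name parse_xref_entry and the simplified three-field format are
assumptions for this example, not the code in PdfParser.cpp:

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Parses "offset generation type" from a classic xref entry line.
// SCNd64 expands to the correct sscanf length modifier for std::int64_t on
// the current platform, so no hand-written "ld"/"lld" selection is needed.
bool parse_xref_entry(const char* line, std::int64_t& offset,
                      std::int64_t& generation, char& type)
{
    int read = std::sscanf(line, "%10" SCNd64 " %5" SCNd64 " %c",
                           &offset, &generation, &type);
    return read == 3; // all three fields were converted
}
```

The same header provides PRId64 and PRIu64 for the printf side.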

Why is PdfXRefStreamParserObject deprecated when it is so essential?



> > podofo/base/PdfDate.cpp:205:50: warning: ‘%s’ directive output may be
> > truncated writing up to 5 bytes into a region of size between 1 and
> > 26 [-Wformat-truncation=]
> >   205 | snprintf( m_szDate, PDF_DATE_BUFFER_SIZE, "%s%s'00'",
> > szDate, szZone );
>
> This one is not claimed by clang here.
>
> I use the same warning flags for both compilers.
>

On ubuntu 20.04 it is reported, with gcc. I now also see that warning about
lua (without lua enabled):

podofo/src/podofo/base/PdfDate.cpp:204:50: warning: ‘%s’ directive output
may be truncated writing up to 5 bytes into a region of size between 1 and
26 [-Wformat-truncation=]
  204 | snprintf( m_szDate, PDF_DATE_BUFFER_SIZE, "%s%s'00'", szDate,
szZone );

podofo/src/podofo/base/PdfMemStream.cpp:57:18: warning: implicitly-declared
‘PoDoFo::PdfMemStream& PoDoFo::PdfMemStream::operator=(const
PoDoFo::PdfMemStream&)’ is deprecated [-Wdeprecated-copy]
   57 | operator=(rhs);

podofo/src/podofo/base/PdfString.cpp:787:58: warning: unused parameter
‘eConversion’ [-Wunused-parameter]
  787 | EPdfStringConversion
eConversion )

podofo/src/podofo/base/PdfString.cpp:825:58: warning: unused parameter
‘eConversion’ [-Wunused-parameter]
  825 | EPdfStringConversion
eConversion  )

podofo/src/podofo/base/PdfTokenizer.cpp:440:75: warning: type qualifiers
ignored on cast result type [-Wignored-qualifiers]
  440 | else if( !(isdigit( static_cast(*pszStart) ) || *pszStart == '-' || *pszStart == '+' ) )

podofo/src/podofo/base/PdfTokenizer.cpp:512:73: warning: type qualifiers
ignored on cast result type [-Wignored-qualifiers]
  512 |  static_cast(l) );

podofo/src/podofo/doc/PdfField.cpp:107:26: warning: implicitly-declared
‘constexpr PoDoFo::PdfField& PoDoFo::PdfField::operator=(const
PoDoFo::PdfField&)’ is deprecated [-Wdeprecated-copy]
  107 | this->operator=( rhs );

podofo/tools/podofocolor/podofocolor.cpp:50:89: warning: unused parameter
‘lua’ [-Wunused-parameter]
   50 | static IConverter* ConverterForName( const std::string & converter,
const std::string & lua )

These when I used clang (still on linux):

podofo/src/podofo/base/PdfParser.cpp:946:5: warning:
'PdfXRefStreamParserObject' is deprecated [-Wdeprecated-declarations]
PdfXRefStreamParserObject xrefObject( m_vecObjects, m_device, m_buffer,
_offsets );

podofo/test/VariantTest/VariantTest.cpp:131:39: warning: illegal character
encoding in string literal [-Winvalid-source-encoding]
TEST_SAFE_OP( Test( "(Hallo \\(schne\\) Welt!)",
ePdfDataType_String ) );

podofo/test/ObjectParserTest/ObjectParserTest.cpp:322:61: warning: illegal
character encoding in string literal [-Winvalid-source-encoding]
const char* pszSimpleObjectString2 = "5 0 obj\n(Hallo \\(schne\\)
Welt!)\nendobj\n";

podofo/test/ObjectParserTest/ObjectParserTest.cpp:448:77: warning: illegal
character encoding in string literal [-Winvalid-source-encoding]
TRY_TEST(TestObject_String( pszSimpleObjectString2, 5, 0, "(Hallo
\\(schne\\) Welt!)" );)



Configuring it with cmake on osx is a little harder. I need to manually set
FREETYPE_INCLUDE_DIR, and then:

-- Performing Test PODOFO_HAVE_OPENSSL_1_1
-- Performing Test PODOFO_HAVE_OPENSSL_1_1 - Failed

Due to this it will not compile:

src/podofo/base/PdfEncrypt.cpp:143:24: error: field has incomplete type
'EVP_CIPHER_CTX' (aka 'evp_cipher_ctx_st')
EVP_CIPHER_CTX aes;

until I manually enable openssl 1.1 in podofo_config.h:

#define PODOFO_HAVE_OPENSSL_1_1

Then there is this error (due to latest commit):

src/podofo/base/PdfString.cpp:136:9: error: unknown type name 'constexpr'
constexpr bool wchar_t_is_two_bytes = sizeof(wchar_t) == 2;

I need to enable c++11. And there are many warnings -Wformat in tools and
tests. Including similar warnings as on linux with 

Re: [Podofo-users] Linking podofo with xcode project

2020-05-25 Thread Michal Sudolsky
Regarding : https://en.cppreference.com/w/cpp/types/integer

On Mon, May 25, 2020 at 11:58 PM Michal Sudolsky 
wrote:

>
>> Hi,
>> this is long time ago. I checked the code above (at r2008) and it looks
>> like this:
>>
>>   int read = sscanf( m_buffer.GetBuffer(), "%10" PDF_FORMAT_INT64 " %5"
>> PDF_FORMAT_INT64 " %c%c%c",
>> , , , ,  );
>>
>> which makes me think that PDF_FORMAT_INT64 is not properly set on OSX,
>> or the sscanf() cannot decipher it properly. Both feels unlikely, but I
>> do not have any environment to test it with.
>>
>>
> Why not use ? There should be proper macros for
> printing/scanning int64_t.
>
> Here size of both long (SZ_LONG) and int64_t (SZ_INT64) is 8 so "ld" is
> used:
>
> #elif defined(SZ_INT64) && defined(SZ_LONG) && SZ_INT64 == SZ_LONG
> #  define PDF_FORMAT_INT64 "ld"
> #  define PDF_FORMAT_UINT64 "lu"
> #  define PDF_SIZE_FORMAT "zu"
>
> Also size of "long long" is 8. I suppose on linux are used these same
> defines.
>
> Probably only compiler on osx have problem with this which is clang
> instead of gcc on linux.
>
> On linux there is also this new strange warning (except usual deprecated
> declarations):
>
> podofo/base/PdfDate.cpp:205:50: warning: ‘%s’ directive output may be
> truncated writing up to 5 bytes into a region of size between 1 and 26
> [-Wformat-truncation=]
>   205 | snprintf( m_szDate, PDF_DATE_BUFFER_SIZE, "%s%s'00'", szDate,
> szZone );
>
> If I am not wrong there are many other warnings under some circumstances
> (do not remember now exactly) like these:
> -Wunused-private-field
> -Wdeprecated-copy
> -Wdeprecated-declarations
> -Wignored-qualifiers
> -Wunused-parameter
> -Wstringop-truncation
>
> I can send more details about them.
>
>> I do not see this one too. Do you build with LUA? The code in question
>> looks properly, from my point of view:
>>
>> #ifdef PODOFO_HAVE_LUA
>> else if( converter == "lua" )
>> {
>> pConverter = new LuaConverter( lua );
>> }
>> #else
>> PODOFO_UNUSED_PARAM( lua )
>> #endif //  PODOFO_HAVE_LUA
>>
>> I do see unused parameter warnings from other functions, thus it is
>> enabled here.
>>
>
> No. Maybe this was already resolved or this happens under above mentioned
> "circumstances".
>
>> Bye,
>> zyx
>>
>>
>>
>> ___
>> Podofo-users mailing list
>> Podofo-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Linking podofo with xcode project

2020-05-25 Thread Michal Sudolsky
>
>
> Hi,
> this is long time ago. I checked the code above (at r2008) and it looks
> like this:
>
>   int read = sscanf( m_buffer.GetBuffer(), "%10" PDF_FORMAT_INT64 " %5"
> PDF_FORMAT_INT64 " %c%c%c",
> , , , ,  );
>
> which makes me think that PDF_FORMAT_INT64 is not properly set on OSX,
> or the sscanf() cannot decipher it properly. Both feels unlikely, but I
> do not have any environment to test it with.
>
>
Why not use the standard header for this? There should be proper macros for
printing/scanning int64_t.

Here the size of both long (SZ_LONG) and int64_t (SZ_INT64) is 8, so "ld" is
used:

#elif defined(SZ_INT64) && defined(SZ_LONG) && SZ_INT64 == SZ_LONG
#  define PDF_FORMAT_INT64 "ld"
#  define PDF_FORMAT_UINT64 "lu"
#  define PDF_SIZE_FORMAT "zu"

Also the size of "long long" is 8. I suppose the same defines are used
on linux.

Probably only the compiler on osx has a problem with this; it is clang
there, instead of gcc as on linux.

On linux there is also this new strange warning (besides the usual deprecated
declarations):

podofo/base/PdfDate.cpp:205:50: warning: ‘%s’ directive output may be
truncated writing up to 5 bytes into a region of size between 1 and 26
[-Wformat-truncation=]
  205 | snprintf( m_szDate, PDF_DATE_BUFFER_SIZE, "%s%s'00'", szDate,
szZone );

If I am not wrong, there are many other warnings under some circumstances
(I do not remember exactly which), like these:
-Wunused-private-field
-Wdeprecated-copy
-Wdeprecated-declarations
-Wignored-qualifiers
-Wunused-parameter
-Wstringop-truncation

I can send more details about them.

> I do not see this one too. Do you build with LUA? The code in question
> looks properly, from my point of view:
>
> #ifdef PODOFO_HAVE_LUA
> else if( converter == "lua" )
> {
> pConverter = new LuaConverter( lua );
> }
> #else
> PODOFO_UNUSED_PARAM( lua )
> #endif //  PODOFO_HAVE_LUA
>
> I do see unused parameter warnings from other functions, thus it is
> enabled here.
>

No. Maybe this was already resolved or this happens under above mentioned
"circumstances".

> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Acroform fields array is not updated when it is reference

2020-05-25 Thread Michal Sudolsky
Hi,

I cannot even set myself as owner when creating a ticket.

As for my own tickets, these are fixed by recent commits:
80, 81, 82, 88



On Mon, May 25, 2020 at 4:24 PM zyx  wrote:

> On Mon, 2020-04-20 at 18:57 +0200, Michal Sudolsky wrote:
> > Also seems I cannot edit or close own tickets on sourceforge. So feel
> > free to close ones that are already committed and resolved.
>
> Hi,
> that's weird, as it works fine here. No idea whether it's about higher
> privileges or not, I might have just commit rights.
>
> Anyway, by any chance, do you have a list of those, please? I tried to
> clean up the tickets, but then I realized there's a big mess in them,
> then I just gave up. I'm sorry.
> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


[Podofo-users] Use GetStream() const where possible to avoid marking objects dirty unnecessarily

2020-04-21 Thread Michal Sudolsky
Hi,

Details and patch are in ticket https://sourceforge.net/p/podofo/tickets/91/
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Loading pdf encrypted with password throws exception

2020-04-20 Thread Michal Sudolsky
>
>
> Hi,
> okay, that sounds correct. I'm not sure whether your mail is complete,
> it seems to be cut for some reason.
>
> Having there a way to know whether password is needed is required, no
> doubt. I would not drop the current call-flow though, as many library
> users rely on it.
>

Hi,

To me it seems that you got the whole email.

The current call-flow can coexist with a better approach and remain the
default, of course. For example there can be a function for password detection
(but not only filename based) and a function to set the password before
loading. During load it would then not throw when the password was already set.
Or there can be a function like SetPasswordCallback, and during loading some
variant of the Load function would call the user supplied callback (instead of
throwing), which should provide the password. If neither password nor callback
was set, the default behaviour would be to throw as now.
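
A rough sketch of how that could look. DemoDocument and all of its members are
hypothetical and exist only for this illustration; none of this is existing
podofo api, and the requiredPassword parameter merely stands in for the
encryption check a real parser would do.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical loader illustrating the proposal; not podofo api.
class DemoDocument
{
public:
    using PasswordCallback = std::function<std::string()>;

    void SetPassword(std::string password)
    {
        m_password = std::move(password);
        m_hasPassword = true;
    }

    void SetPasswordCallback(PasswordCallback callback)
    {
        m_callback = std::move(callback);
    }

    // An empty `requiredPassword` means "document is not encrypted".
    void Load(const std::string& requiredPassword)
    {
        if (requiredPassword.empty())
            return;
        if (!m_hasPassword && m_callback)
            SetPassword(m_callback()); // ask the user instead of throwing
        if (!m_hasPassword)
            throw std::runtime_error("password required"); // default: throw as now
        if (m_password != requiredPassword)
            throw std::runtime_error("invalid password");
    }

private:
    std::string m_password;
    bool m_hasPassword = false;
    PasswordCallback m_callback;
};
```

Callers that set neither a password nor a callback still get the exception, so
the existing call-flow keeps working.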



> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Acroform fields array is not updated when it is reference

2020-04-20 Thread Michal Sudolsky
Hi,

Oh, I forgot to send the updated version of this patch here. It is available
here: https://sourceforge.net/p/podofo/tickets/79/#a549

Also it seems I cannot edit or close my own tickets on sourceforge. So feel
free to close the ones that are already committed and resolved.


On Fri, Mar 27, 2020 at 6:07 PM zyx  wrote:

> On Mon, 2020-03-02 at 21:32 +0100, Michal Sudolsky wrote:
> > Patch attached.
>
> Hi,
> thanks for the patch, I committed it as r2002:
> http://sourceforge.net/p/podofo/code/2002
>
> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Memory leak in PdfImage::LoadFromPngData and PdfImage::LoadFromPngHandle

2020-04-20 Thread Michal Sudolsky
Hi,

A similar memory leak is in the TIFF loading function. I am sending test
cases as files for both TIFF and PNG.

This is because the buffer is not deleted if an exception is thrown between
new and delete:

char *buffer = new char[bufferSize];
if( !buffer )
{
    TIFFClose(hInTiffHandle);
    PODOFO_RAISE_ERROR( ePdfError_OutOfMemory );
}

for(row = 0; row < height; row++)
{
    if(TIFFReadScanline(hInTiffHandle,
                        &buffer[row * scanlineSize],
                        row) == (-1))
    {
        TIFFClose(hInTiffHandle);
        PODOFO_RAISE_ERROR( ePdfError_UnsupportedImageFormat );
    }
}

PdfMemoryInputStream stream(buffer, bufferSize);

SetImageData(static_cast<unsigned int>(width),
             static_cast<unsigned int>(height),
             static_cast<unsigned int>(bitsPerSample),
             &stream);

delete[] buffer;
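
One common way to close this kind of leak is to let a container own the
buffer, so it is released on the throwing path as well. The following is only
a sketch under that assumption, not the committed fix; ReadScanlineSketch is
a hypothetical stand-in for TIFFReadScanline and LoadRowsSketch is an
illustrative name:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for TIFFReadScanline: fails on a chosen row,
// otherwise fills the row with a recognisable value.
inline bool ReadScanlineSketch(char* dst, std::size_t size,
                               std::size_t row, std::size_t failRow) {
    if (row == failRow) return false;
    for (std::size_t i = 0; i < size; ++i) dst[i] = static_cast<char>(row);
    return true;
}

// The vector owns the buffer, so it is freed both on normal return and
// during stack unwinding when the exception below is thrown.
inline std::vector<char> LoadRowsSketch(std::size_t height,
                                        std::size_t scanlineSize,
                                        std::size_t failRow) {
    std::vector<char> buffer(height * scanlineSize);
    for (std::size_t row = 0; row < height; ++row) {
        char* dst = buffer.data() + row * scanlineSize;
        if (!ReadScanlineSketch(dst, scanlineSize, row, failRow))
            throw std::runtime_error("unsupported image format");
    }
    return buffer;
}
```

No explicit delete[] remains, so there is no path on which it can be skipped.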

Here is output from valgrind and clang memory sanitizer:

==17391== 8,192 bytes in 1 blocks are definitely lost in loss record 1,202 of 1,257
==17391==    at 0x4C3089F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==17391==    by 0x970753: PoDoFo::PdfImage::LoadFromTiffHandle(void*) (PdfImage.cpp:638)
==17391==    by 0x9711ED: PoDoFo::PdfImage::LoadFromTiffData(unsigned char const*, long) (PdfImage.cpp:838)

==7771==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 8192 byte(s) in 1 object(s) allocated from:
    #0 0x55f796949e05 in operator new[](unsigned long) /checkout/src/llvm-project/compiler-rt/lib/lsan/lsan_interceptors.cc:231:37
    #1 0x55f796eb7ec5 in PoDoFo::PdfImage::LoadFromTiffHandle(void*) podofo/doc/PdfImage.cpp:638
    #2 0x55f796eb895f in PoDoFo::PdfImage::LoadFromTiffData(unsigned char const*, long) podofo/doc/PdfImage.cpp:838

On Tue, Nov 5, 2019 at 8:17 PM Michal Sudolsky  wrote:

> If function png_read_image encounters error it will call longjmp and will
> jump to place where was called setjmp. In such case pBuffer and pRows will
> never be deallocated.
>
> Cut of code from mentioned functions:
> ```
> // Read the file
> if( setjmp(png_jmpbuf(pPng)) )
> {
>     png_destroy_read_struct(&pPng, &pInfo, (png_infopp)NULL);
>     PODOFO_RAISE_ERROR( ePdfError_InvalidHandle );
> }
>
> long lLen = static_cast<long>(png_get_rowbytes(pPng, pInfo) * height);
> char* pBuffer = static_cast<char*>(podofo_calloc(lLen, sizeof(char)));
> ...
> png_bytepp pRows = static_cast<png_bytepp>(podofo_calloc(height,
> sizeof(png_bytep)));
> ...
> png_read_image(pPng, pRows);
> ...
> podofo_free(pBuffer);
> podofo_free(pRows);
> ```
>
> Test example:
> ```
> // include podofo, vector, etc
> int main()
> {
> PdfMemDocument doc;
> PdfImage image(&doc);
> std::vector<unsigned char> corrupt_png = {
> 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A,
> 0x00, 0x00, 0x00, 0x0D,
> 'I', 'H', 'D', 'R',
> 0x00, 0x00, 0x04, 0x00,
> 0x00, 0x00, 0x05, 0xA9,
> 0x08, 0x02, 0x00, 0x00, 0x00,
> 0x68, 0x1B, 0xF7, 0x46,
> 0x00, 0x00, 0x00, 0x00,
> 'I', 'D', 'A', 'T',
> 0x35, 0xAF, 0x06, 0x1E,
> 0x00, 0x00, 0x00, 0x00,
> 'I', 'E', 'N', 'D',
> 0xAE, 0x42, 0x60, 0x82
> };
> try
> {
> image.LoadFromPngData(corrupt_png.data(), corrupt_png.size());
> }
> catch(PdfError & e)
> {
> e.PrintErrorMsg();
> }
> return 0;
> }
> ```
>
> Test for leaks using valgrind (libpng outputs errors to stderr as it is
> not muted in podofo like libjpeg and libtiff):
> ```
> libpng error: Not enough image data
>
> PoDoFo encountered an error. Error: 2 ePdfError_InvalidHandle
> Error Description: A NULL handle was passed, but initialized data was
> expected.
> Callstack:
> #0 Error Source: podofo/doc/PdfImage.cpp:1142
>
> ==97240==
> ==97240== HEAP SUMMARY:
> ==97240== in use at exit: 4,462,920 bytes in 2 blocks
> ==97240==   total heap usage: 128 allocs, 126 frees, 4,567,527 bytes
> allocated
> ==97240==
> ==97240== 4,462,920 (11,592 direct, 4,451,328 indirect) bytes in 1 blocks
> are definitely lost in loss record 2 of 2
> ==97240==at 0x4C31B25: calloc (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==97240==by 0x1759CE: PoDoFo::podofo_calloc(unsigned long, unsigned
> long)
> ==97240==by 0x1E0EE4: PoDoFo::PdfImage::LoadFromPngData(unsigned char
> const*, long)
> ==97240==by 0x125610: main
> ==97240==
> ==97240== LEAK SUMMARY:
> ==97240==definitely lost: 11,592 bytes in 1 blocks
> ==97

Re: [Podofo-users] Strange pdf standard 14 font or something else?

2020-03-20 Thread Michal Sudolsky
>
>
> Hi,
> no, text extracting is not considered rendering, it's not that far from
> it, but it's just reading. I've been missing this in the context you
> gave.
>

Also, as PDF does not really have a rigid concept of things such as words and
text strings (it only knows glyphs and their positions), glyph widths (and
positions) are an important part of the information needed for text
extraction to work properly. I would not call the podofo tool text extraction
but rather "character extraction". So to get text from a pdf correctly you
really need to "render" it ;)
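
To illustrate why widths matter, here is a rough sketch of turning positioned
glyphs into words. It is not how the podofo extractor works; GlyphSketch,
GlyphsToTextSketch and the gap threshold are assumptions made up for this
example, and it only handles glyphs on a single horizontal line:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical glyph record: decoded character plus horizontal placement.
struct GlyphSketch {
    char ch;
    double x;      // left edge of the glyph
    double width;  // advance width; unknown if the font has no Widths
};

// Joins glyphs into text and inserts a space whenever the gap between the
// previous glyph's right edge and the next glyph's left edge exceeds
// gapThreshold. Without widths the right edge cannot be computed at all.
inline std::string GlyphsToTextSketch(const std::vector<GlyphSketch>& glyphs,
                                      double gapThreshold) {
    std::string out;
    for (std::size_t i = 0; i < glyphs.size(); ++i) {
        if (i > 0) {
            const double prevEnd = glyphs[i - 1].x + glyphs[i - 1].width;
            if (glyphs[i].x - prevEnd > gapThreshold) out += ' ';
        }
        out += glyphs[i].ch;
    }
    return out;
}
```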
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Strange pdf standard 14 font or something else?

2020-03-19 Thread Michal Sudolsky
>
>
> > And podofo cannot handle such situation.
>
> Hi,
> I do not have any knowledge or opinion on the matter, but I'm
> wondering: in what way PoDoFo cannot handle such situation, please? As
> long as PoDoFo does not render the content (which it doesn't), then the
> validity or completeness of the font descriptions doesn't matter, as
> long as the library user can get to the objects/dictionaries on his/her
> own, no? The only affected part would be when opening such a PDF file
> and trying to check some string widths or anything like that, which I,
> personally, consider in the same bucket as PDF content rendering.
>
> I always understood PoDoFo as a library to read and manipulate PDF
> files, with some higher API for some common parts, but which are
> optional, because the library user can always use the low level API and
> do all of that on its own. I never understood PoDoFo as a library for
> easier rendering of the PDF content.
>

Is text extraction considered rendering? This is actually not about missing
widths. PdfMemDocument::GetFont throws an exception on such a font:

PODOFO_RAISE_ERROR_INFO( ePdfError_NoObject, "Font object defines neither Widths, nor MissingWidth values!" );

So if such a font is used on some page, the podofo text extractor would throw
(I did not check whether the mentioned pdf uses this font on a page, but it
uses it with acroforms). Podofo refuses to "load" such a font, which is
needed to decode text that uses it.

Of course almost everything that podofo does wrong can be worked around by
using the podofo low-level API and doing these things better. Anyone can
create their own better PdfMemDocument based on the rest of the project. But
this will not fix the podofo text extractor tool, for example.



> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


[Podofo-users] Strange pdf standard 14 font or something else?

2020-03-17 Thread Michal Sudolsky
Hi,

Seems there are pdf files like
https://www.courts.state.hi.us/docs/form/oahu/1DC08.pdf which has font
object such as:

464 0 obj
<<
/Name /Helv
/Type /Font
/Subtype /TrueType
/BaseFont /Helvetica
/Encoding 465 0 R
/FontDescriptor 466 0 R
>>
endobj

466 0 obj
<<
/Type /FontDescriptor
/FontName /Arial
/Ascent 1006
/CapHeight 716
/Descent -325
/Flags 32
/FontBBox [ -665 -325 2000 1006 ]
/ItalicAngle 0
/StemV 88
/XHeight 519
>>
endobj

It looks like it should be the base 14 font Helvetica, but it has subtype
TrueType, and the FontName in the font descriptor is Arial while FontFile,
FontFile2, Widths and MissingWidth are all missing. Maybe this has to be
treated as an external font? But Widths is required except for base 14 fonts.
It looks like Helvetica in pdf viewers. This font and pdf are definitely
corrupted, because base 14 fonts should have either all or none of Widths,
FirstChar, LastChar and FontDescriptor, and other Type 1 and TrueType fonts
should have all of them according to the pdf reference. But more such pdf
files exist. Some have the correct Subtype Type1 but still have a
FontDescriptor without Widths. And podofo cannot handle such a situation.

I am wondering whether podofo should try to interpret fonts with Subtype
Type1/TrueType that have a FontDescriptor but are missing Widths and FontFile
or FontFile2 as base 14 fonts.
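
The proposed fallback could be sketched as a simple predicate. This is only
an illustration of the rule, not podofo code: FontInfoSketch and both
function names are hypothetical, and additionally requiring BaseFont to be
one of the standard 14 names is an extra assumption on top of the proposal:

```cpp
#include <array>
#include <string>

// Hypothetical summary of what was found in the font dictionary.
struct FontInfoSketch {
    std::string subtype;    // "Type1", "TrueType", ...
    std::string baseFont;   // e.g. "Helvetica"
    bool hasFontDescriptor;
    bool hasWidths;
    bool hasFontFile;       // FontFile or FontFile2 present
};

inline bool IsBase14NameSketch(const std::string& name) {
    static const std::array<const char*, 14> names = {
        "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
        "Helvetica", "Helvetica-Bold", "Helvetica-Oblique",
        "Helvetica-BoldOblique", "Times-Roman", "Times-Bold", "Times-Italic",
        "Times-BoldItalic", "Symbol", "ZapfDingbats"};
    for (const char* n : names)
        if (name == n) return true;
    return false;
}

// True when the font looks like a damaged base 14 font: a simple subtype
// with a descriptor, but neither Widths nor an embedded font program.
inline bool TreatAsBase14Sketch(const FontInfoSketch& f) {
    const bool simpleSubtype = f.subtype == "Type1" || f.subtype == "TrueType";
    return simpleSubtype && f.hasFontDescriptor && !f.hasWidths &&
           !f.hasFontFile && IsBase14NameSketch(f.baseFont);
}
```

For object 464 above this would return true (TrueType, BaseFont Helvetica,
descriptor present, no Widths, no font file).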
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


[Podofo-users] Strings in object streams are decrypted twice for some encryption algorithms

2020-03-12 Thread Michal Sudolsky
Hi,

There is this part of code in PdfObjectStreamParserObject.cpp:

if( m_pEncrypt && (m_pEncrypt->GetEncryptAlgorithm() == PdfEncrypt::ePdfEncryptAlgorithm_AESV2
#ifndef PODOFO_HAVE_OPENSSL_NO_RC4
    || m_pEncrypt->GetEncryptAlgorithm() == PdfEncrypt::ePdfEncryptAlgorithm_RC4V2
#endif // PODOFO_HAVE_OPENSSL_NO_RC4
    ) )
    variantTokenizer.GetNextVariant( var, 0 ); // Stream is already decrypted
else
    variantTokenizer.GetNextVariant( var, m_pEncrypt );

But the document
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
clearly states in 7.6.1 that strings inside object streams are not encrypted
themselves; instead the whole stream is encrypted (regardless of the
encryption algorithm used). So why is this done correctly only for
ePdfEncryptAlgorithm_RC4V2 and ePdfEncryptAlgorithm_AESV2?

This pdf
https://web.archive.org/web/20200109090442/https://www.ok.gov/tax/documents/bm26.pdf
uses ePdfEncryptAlgorithm_RC4V1 and this code prints rubbish:

PdfMemDocument pdf("bm26.pdf");
printf("%s\n", pdf.GetPage(0)->GetAnnotation(3)->GetTitle().GetString());

Possible output: "?v??=??>?>"

Attached is a patch which completely removes m_pEncrypt from
PdfObjectStreamParserObject, as there is nothing there that should be
decrypted. The only exception is the source stream from which the objects are
loaded, and that is decrypted elsewhere during "m_pParser->GetStream()" (this
m_pParser has its own m_pEncrypt, of course).

Correct output after patch: "Check Type"
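
Why the second pass produces rubbish can be shown with a toy model. The XOR
transform below merely stands in for a stream cipher such as RC4 and is not
real cryptography; all names are illustrative. The point is that string
decryption uses a key derived for the individual object, which differs from
the key that already decrypted the whole object stream:

```cpp
#include <string>

// Toy XOR transform standing in for a stream cipher.
// Applying it twice with the same key restores the input.
inline std::string XorSketch(std::string data, unsigned char key) {
    for (char& c : data)
        c = static_cast<char>(static_cast<unsigned char>(c) ^ key);
    return data;
}

// Correct: the object stream is decrypted once, as a whole.
inline std::string DecryptStreamOnlySketch(const std::string& encrypted,
                                           unsigned char streamKey) {
    return XorSketch(encrypted, streamKey);
}

// Buggy: strings inside the already decrypted stream are "decrypted"
// again with a different, per-object key, which scrambles them.
inline std::string DecryptTwiceSketch(const std::string& encrypted,
                                      unsigned char streamKey,
                                      unsigned char objectKey) {
    return XorSketch(XorSketch(encrypted, streamKey), objectKey);
}
```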


decrypt_object_streams_once.patch
Description: Binary data
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] [BRANCH] PdfObject automatic ownership handling

2020-03-05 Thread Michal Sudolsky
Also, I personally think "FindAt" should check the index too and return NULL
if it is higher than or equal to the array size. That way it would be easier
to use in many places and would be more akin to FindKey. The performance
impact should be negligible, because it still needs to check whether the
returned value is a reference, in contrast with the faster operator[] of
PdfArray (although that operator also checks for "mutability").
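
A sketch of the behaviour I have in mind, using tiny stand-in types instead
of PdfArray and PdfObject. ObjSketch and ArraySketch are hypothetical, the
object table is a plain std::map, and references are resolved one level only:

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical object: either a direct value or a reference to object refId.
struct ObjSketch {
    bool isReference = false;
    int refId = 0;   // meaningful only when isReference is true
    int value = 0;   // payload of a direct object
};

class ArraySketch {
public:
    ArraySketch(std::vector<ObjSketch> items,
                const std::map<int, ObjSketch>* objects)
        : m_items(std::move(items)), m_objects(objects) {}

    // Returns nullptr for an out-of-range index or a dangling reference.
    const ObjSketch* FindAt(std::size_t idx) const {
        if (idx >= m_items.size()) return nullptr;
        const ObjSketch& item = m_items[idx];
        if (!item.isReference) return &item;
        const auto it = m_objects->find(item.refId);
        return it == m_objects->end() ? nullptr : &it->second;
    }

    // Same lookup, but throws instead of returning nullptr.
    const ObjSketch& MustFindAt(std::size_t idx) const {
        const ObjSketch* obj = FindAt(idx);
        if (!obj) throw std::out_of_range("no object at index");
        return *obj;
    }

private:
    std::vector<ObjSketch> m_items;
    const std::map<int, ObjSketch>* m_objects;
};
```

Callers that can cope with a missing entry use FindAt and test for NULL;
callers that dereference immediately use MustFindAt.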

On Thu, Mar 5, 2020 at 9:04 PM Michal Sudolsky  wrote:

> Hi,
>
> Seems there is only one merge conflict due to r1990 which I think could be
> fixed in better way by using new "FindAt" function or similar (I would
> propose to add "MustFindAt" which would check that it does not return NULL
> and also whether is index within array bounds).
>
> On Wed, Feb 26, 2020 at 8:57 AM zyx  wrote:
>
>> On Mon, 2020-02-24 at 20:51 +0100, Francesco Pretto wrote:
>> > I will not be able to fix the conflicts, re-test and commit the patch
>> > myself for at least 2 weeks.
>>
>> Hi,
>> that's okay, there is no hurry. I won't get to it sooner too. Unless
>> you beat me I'll give it a try in around the similar time frame or
>> slightly later (I'm also kind of behind PoDoFo development at the
>> moment), say mid or the second half of this year's March (supposing
>> nothing more urgent won't step in on my side).
>> Thanks and bye,
>> zyx
>>
>>
>>
>> ___
>> Podofo-users mailing list
>> Podofo-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] [BRANCH] PdfObject automatic ownership handling

2020-03-05 Thread Michal Sudolsky
Hi,

It seems there is only one merge conflict, due to r1990, which I think could
be fixed in a better way by using the new "FindAt" function or similar (I
would propose adding "MustFindAt", which would check that it does not return
NULL and also that the index is within the array bounds).

On Wed, Feb 26, 2020 at 8:57 AM zyx  wrote:

> On Mon, 2020-02-24 at 20:51 +0100, Francesco Pretto wrote:
> > I will not be able to fix the conflicts, re-test and commit the patch
> > myself for at least 2 weeks.
>
> Hi,
> that's okay, there is no hurry. I won't get to it sooner too. Unless
> you beat me I'll give it a try in around the similar time frame or
> slightly later (I'm also kind of behind PoDoFo development at the
> moment), say mid or the second half of this year's March (supposing
> nothing more urgent won't step in on my side).
> Thanks and bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


[Podofo-users] Loading pdf encrypted with password throws exception

2020-03-05 Thread Michal Sudolsky
Hi,

It seems the only current option for opening a password-encrypted pdf is to
wrap the "Load" call in try-catch and, on the specific exception, call
"SetPassword" to finish loading. But as the name suggests, exceptions should
be used for exceptional things which do not occur very often, also because
they are relatively slow. Yet podofo requires the exception mechanism for as
regular a workflow as opening password-protected documents.

It would be better to signal this in a different way (return value?) or to
make it possible to set the password before loading the pdf. There is also an
interesting function "QuickEncryptedCheck" in the parser, but it accepts only
file names.
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


[Podofo-users] Support indirect references in dictionaries where podofo expects only direct objects

2020-03-04 Thread Michal Sudolsky
Hi,

There are places which expect only direct objects, but these objects may also
be indirect. As stated in
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
at the end of section 7.3.10 "Indirect objects", any object value may be
direct or an indirect reference, except in a few cases where it is explicitly
stated that it must be one or the other. For example, keys of dictionaries,
string values in the encryption dictionary and some keys in the
cross-reference stream dictionary must be direct objects.

For example this pdf
https://courtselfhelp.idaho.gov/docs/forms/CAO_NCA_1-2.pdf contains
"listbox" acroform field with "Opt" key as indirect reference but podofo is
able to handle only direct object here as can be also seen in source
PdfField.cpp for example function PdfListField::GetItemCount. There are
other PDF files which for example use indirect reference for "Rect" field
of page or acroform field. Another example pdf is attached which contains
indirect "Filter" key in stream. It is perfectly valid pdf but podofo has
problems with it for example when trying to draw on page 0 it cannot decode
correctly old content in this stream.

The attached patch fixes cases where podofo expects only direct objects but
they can also be indirect. This is mostly done by replacing
"GetDictionary().GetKey(...)" with GetIndirectKey or MustGetIndirectKey. This
change should be backward compatible: in the cases where GetIndirectKey or
MustGetIndirectKey additionally throws (when the key value is a reference and
either the referenced object does not exist or the dictionary does not have
an owner), the old code would return this reference and the subsequent code,
typically of the form "obj->Get[Name/Array/String/...etc]", would throw as
well. In other cases it avoids what would be invalid and would have failed
later anyway.
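
The difference between the two lookups can be sketched with stand-in types.
This is not the PoDoFo implementation; ValueSketch, the two map aliases and
both function names are hypothetical, and the dangling-reference behaviour
simply mirrors the description above:

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical value: either a direct name or a reference to object refId.
struct ValueSketch {
    bool isReference = false;
    int refId = 0;
    std::string name;  // payload of a direct value
};

using DictSketch = std::map<std::string, ValueSketch>;
using ObjectTableSketch = std::map<int, ValueSketch>;

// GetKey analogue: returns whatever is stored, even if it is a reference.
inline const ValueSketch* GetKeySketch(const DictSketch& dict,
                                       const std::string& key) {
    const auto it = dict.find(key);
    return it == dict.end() ? nullptr : &it->second;
}

// GetIndirectKey analogue: follows a reference to the object it points to.
// A missing key yields nullptr; a dangling reference throws.
inline const ValueSketch* GetIndirectKeySketch(const DictSketch& dict,
                                               const ObjectTableSketch& objects,
                                               const std::string& key) {
    const ValueSketch* v = GetKeySketch(dict, key);
    if (!v || !v->isReference) return v;
    const auto it = objects.find(v->refId);
    if (it == objects.end())
        throw std::runtime_error("referenced object does not exist");
    return &it->second;
}
```

With a stream dictionary whose Filter is "5 0 R", the first lookup hands back
the reference, while the second hands back the name stored in object 5.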

There are also variants GetKeyAsName, GetKeyAsLong, GetKeyAsReal and
GetKeyAsBool for which I added indirect counterparts like
"GetIndirectKeyAs...".

This patch does not fix bugs where podofo expects direct objects in arrays or
during certain enumerations where these values can also be indirect
references. There are a few hundred uses of PdfArray indexing and they are
harder to find than in the case of dictionaries (GetKey). Fixing these as
well will be much easier once automatic object ownership is merged, because
"FindAt" can then be used.

This patch does not fix the stream dictionary key "DecodeParms" and its child
objects (like Predictor, Colors, BitsPerComponent, Columns, EarlyChange), as
this will also be easier after automatic object ownership, where an indirect
key value can be retrieved using "FindKey" called on PdfDictionary. Currently
it is not possible to dereference indirect objects from PdfDictionary, and
the interfaces which accept decode params all use this type as a parameter. I
think it is better to fix it using "FindKey" rather than to change these
interfaces to accept PdfObject.

This patch does not fix problems with the encryption dictionary, because
podofo needs to parse it before parsing all other objects and expects that
all key values will be direct. But the pdf reference states that only string
values within the encryption dictionary must be direct objects, so all others
(names, numbers and so on) can also be indirect. So fixing this is not so
easy.

Here is a summary of the dictionary keys fixed by this patch, all of which
can also be indirect:
- Filter in stream
- Kids in page tree node
- Kids in name tree node
- Names in name tree node
- Limits in name tree node
- MediaBox in page object
- CropBox in page object
- TrimBox in page object
- BleedBox in page object
- ArtBox in page object
- Rotate in page object
- Resources in page object
- Type in page object
- JS in JavaScript action
- BaseEncoding in encoding
- F in file specification
- UF in file specification
- Type in font
- Subtype in font
- MissingWidth in font
- MissingWidth in font descriptor
- D in go-to action
- Width in image
- Height in image
- * in document information
- Version in catalog
- ColorSpace in resource
- AS in annotation
- H in annotation
- F in annotation
- MK in annotation
- AP in annotation
- Rect in annotation
- Contents in annotation
- C in annotation
- Dest in link annotation
- Open in pop-up annotation
- QuadPoints in text markup annotations
- N in appearance
- FT in field
- Ff in field
- V in field
- RV in field
- TM in field
- TU in field
- T in field
- AA in field
- MaxLen in text field
- Opt in choice fields
- AC in appearance characteristics
- RC in appearance characteristics
- CA in appearance characteristics
- NeedAppearances in interactive form
- S in action
- Subtype in annotation
- Type in element
- Flags in font descriptor
- FirstChar in font
- LastChar in font
- DW in font
- FontWeight in font descriptor
- ItalicAngle in font descriptor
- Ascent in font descriptor
- Descent in font descriptor
- Subtype in xobject


indirect_filter.pdf
Description: Adobe PDF document


dict_indirect_objects.patch
Description: Binary data

[Podofo-users] Change GetIndirectKey to MustGetIndirectKey on appropriate places to avoid UB

2020-03-02 Thread Michal Sudolsky
Hi,

There are places where GetIndirectKey is used but the intention was clearly
to use MustGetIndirectKey, because the returned value is immediately
dereferenced. Also, for an invalid pdf it is better that MustGetIndirectKey
throws an exception than to crash due to accessing invalid memory.

Patch attached. Some occurrences are deliberately skipped to avoid merge
conflicts with another upcoming patch.


getindirectkey_to_mustgetindirectkey.patch
Description: Binary data
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] [External] Re: Patch for ignoring broken objects

2020-03-02 Thread Michal Sudolsky
>
>
> tabs are harder to spot. What I noticed, by reading, is an extra space
> after 'if' in:
>
> +if ( s_bIgnoreBrokenObjects )
>
> As I said, that's a really minor thing.
>
>
I just ignored this one because it was here also before and the author just
changed "m" to "s".

I contribute to multiple projects and each of them uses slightly
> different coding style, thus I know it's easy to do such things. When
> it comes to it, I do that myself too, more often than I'd like and the
> worse, I even do not notice it, even when re-reading the change before
> committing. I mean, I do not blame anyone, I know these things happen.
> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] [External] Re: Patch for ignoring broken objects

2020-02-28 Thread Michal Sudolsky
>
>
> (and one really minor coding style "violence").
>
>
Do you mean that one added tab?


> I didn't try to build this yet, but I do not expect any problem with
> it. I'll commit this (updated) after Francesco changes land.
> Thanks again and bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] [External] Re: Patch for ignoring broken objects

2020-02-27 Thread Michal Sudolsky
Looks good. Maybe m_bStrictParsing could also be static. It seems there is no
way to turn on strict parsing (if anyone needs this at all).


On Thu, Feb 27, 2020 at 8:20 PM John Senneker 
wrote:

> Good catch. Here’s the patch with that code added back.
>
>
>
> *From:* Michal Sudolsky 
> *Sent:* Thursday, February 27, 2020 2:05 PM
> *To:* John Senneker 
> *Cc:* zyx ; podofo-users 
> *Subject:* Re: [Podofo-users] [External] Re: Patch for ignoring broken
> objects
>
>
>
> Hi,
>
>
>
> There is missing this part of original patch:
>
>
>
> @@ -1304,7 +1306,15 @@
>
>  std::ostringstream oss;
>
>  oss << "Loading of object " << nObjNo << " 0 R failed!" <<
> std::endl;
>
> -PODOFO_RAISE_ERROR_INFO( ePdfError_NoObject, oss.str().c_str() );
>
> +if ( m_bIgnoreBrokenObjects )
>
> +{
>
> +PdfError::LogMessage( eLogSeverity_Error, oss.str().c_str() );
>
> +return;
>
> +}
>
> +else
>
> +{
>
> +PODOFO_RAISE_ERROR_INFO( ePdfError_NoObject,
> oss.str().c_str() );
>
> +}
>
>  }
>
>
>
> On Thu, Feb 27, 2020 at 7:38 PM John Senneker 
> wrote:
>
> Hi zyx,
> Here's a patch that:
> * makes PdfParser::m_bIgnoreBrokenObjects a static member, which can be
> changed by calling the existing setter and getter methods (which are now
> static).
> * removes the code in PdfParser::Init() that set m_bIgnoreBrokenObjects to
> false
> * makes the default for the new static member true
>
> The new patch makes no changes to things other than PdfParser. So the API
> for people who don't want to ignore broken objects would be to call
> PdfParser::SetIgnoreBrokenObjects() before calling PdfMemDocument::Load(),
> or whatever else they're doing.
>
> I think this is what you and Michal were suggesting, but if I've
> misunderstood please let me know!
> --
> JS
>
> -Original Message-
> From: zyx 
> Sent: Thursday, February 27, 2020 1:55 AM
> To: podofo-users 
> Cc: John Senneker 
> Subject: Re: [Podofo-users] [External] Re: Patch for ignoring broken
> objects
>
> On Wed, 2020-02-26 at 19:32 +0100, Michal Sudolsky wrote:
> > Unless someone really needs to use different settings in different
> > threads for some reason (now or in future).
>
> Hi,
> I agree and I'd say it'll be a minority of the users, if any. Let's try
> with the simplest method, with the static variable (and methods to get/set
> the value) in PdfParser.
>
> John, would you mind to update your patch in this regard, please? I'd like
> to give you the credits for the change, as it is your initiative, thus it
> deserves it. Check the recent messages in this thread for the suggested
> changes.
>
> Thanks and bye,
> zyx
>
>  
> Electronic mail messages entering and leaving Arup business systems are
> scanned for viruses and acceptability of content.
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] [External] Re: Patch for ignoring broken objects

2020-02-27 Thread Michal Sudolsky
Hi,

This part of the original patch is missing:

@@ -1304,7 +1306,15 @@
std::ostringstream oss;
oss << "Loading of object " << nObjNo << " 0 R failed!" << std::endl;
- PODOFO_RAISE_ERROR_INFO( ePdfError_NoObject, oss.str().c_str() );
+ if ( m_bIgnoreBrokenObjects )
+ {
+ PdfError::LogMessage( eLogSeverity_Error, oss.str().c_str() );
+ return;
+ }
+ else
+ {
+ PODOFO_RAISE_ERROR_INFO( ePdfError_NoObject, oss.str().c_str() );
+ }
}
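
For reference, the static-flag design discussed in this thread, combined with
the log-or-throw branch from the hunk above, can be sketched as follows.
ParserSketch is a stand-in class, not PdfParser; the in-memory log replaces
PdfError::LogMessage, and the default of true follows the quoted patch
description:

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser with a process-wide setting.
class ParserSketch {
public:
    static void SetIgnoreBrokenObjects(bool ignore) { s_ignoreBrokenObjects = ignore; }
    static bool GetIgnoreBrokenObjects() { return s_ignoreBrokenObjects; }

    // Returns true when the object loaded. A broken object is either
    // logged and skipped, or reported by throwing, depending on the flag.
    bool LoadObject(int objNo, bool broken) {
        if (!broken) return true;
        const std::string msg =
            "Loading of object " + std::to_string(objNo) + " 0 R failed!";
        if (s_ignoreBrokenObjects) {
            m_log.push_back(msg);
            return false;
        }
        throw std::runtime_error(msg);
    }

    const std::vector<std::string>& Log() const { return m_log; }

private:
    static inline bool s_ignoreBrokenObjects = true;  // shared by all parsers
    std::vector<std::string> m_log;
};
```

Because the flag is static it is shared by every parser instance and every
thread, which is exactly the trade-off mentioned earlier in the thread.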

On Thu, Feb 27, 2020 at 7:38 PM John Senneker 
wrote:

> Hi zyx,
> Here's a patch that:
> * makes PdfParser::m_bIgnoreBrokenObjects a static member, which can be
> changed by calling the existing setter and getter methods (which are now
> static).
> * removes the code in PdfParser::Init() that set m_bIgnoreBrokenObjects to
> false
> * makes the default for the new static member true
>
> The new patch makes no changes to things other than PdfParser. So the API
> for people who don't want to ignore broken objects would be to call
> PdfParser::SetIgnoreBrokenObjects() before calling PdfMemDocument::Load(),
> or whatever else they're doing.
>
> I think this is what you and Michal were suggesting, but if I've
> misunderstood please let me know!
> --
> JS
>
> -Original Message-
> From: zyx 
> Sent: Thursday, February 27, 2020 1:55 AM
> To: podofo-users 
> Cc: John Senneker 
> Subject: Re: [Podofo-users] [External] Re: Patch for ignoring broken
> objects
>
> On Wed, 2020-02-26 at 19:32 +0100, Michal Sudolsky wrote:
> > Unless someone really needs to use different settings in different
> > threads for some reason (now or in future).
>
> Hi,
> I agree and I'd say it'll be a minority of the users, if any. Let's try
> with the simplest method, with the static variable (and methods to get/set
> the value) in PdfParser.
>
> John, would you mind to update your patch in this regard, please? I'd like
> to give you the credits for the change, as it is your initiative, thus it
> deserves it. Check the recent messages in this thread for the suggested
> changes.
>
> Thanks and bye,
> zyx
>
>  
> Electronic mail messages entering and leaving Arup business systems are
> scanned for viruses and acceptability of content.
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>
___
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users


Re: [Podofo-users] Arabic characters not displayed properly when AddText API is used.

2020-02-27 Thread Michal Sudolsky
This is just a starter: what I would try first out of curiosity, as I suppose
it may actually work in certain circumstances. Some additional mapping or
other things may be needed. To do this correctly I would read, for example,
something about CID fonts and other things about fonts in pdf (I did not
research this much).

On Thu, Feb 27, 2020 at 5:18 PM Michal Sudolsky  wrote:

> Hi,
>
> This podofo thread may be useful:
> https://sourceforge.net/p/podofo/mailman/message/32419281/
>
> So suppose font was added to pdf either by podofo api or manually as
> described there if needed.
>
> You can get glyph ids and positions using HarfBuzz:
>
> 1. You may not be able to use PdfPainter api like DrawText or AddText
> because they all handle unicode strings unless you somehow bypass this and
> makes them accept glyph ids. But character positions may be incorrect
> unless you draw individual glyphs into positions from HarfBuzz.
>
> 2. Glyph ids from HarfBuzz should be what is expected in PDF drawing
> instructions like "Tj" without PdfPainter. You can follow what is doing
> SetPage and DrawText in PdfPainter. First get PdfStream of page contents
> like in SetPage "pdf.GetPage(n)->GetContentsForAppending()->GetStream()".
> Call BeginAppend then Append whitespace and text operators and call
> EndAppend. Like DrawText use operators like this:
>
> BT
> /font_id font_size Tf
> position_x position_y Td
> <hex glyph ids> Tj
> ET
>
> For example like this with single glyph id 0xAF12 and two glyphs 0x1234
> and 0xABCD (in pdf endian):
>
> BT
> /Ft14 24 Tf
> 10 20 Td
> <AF12> Tj
> 30 40 Td
> <1234ABCD> Tj
> ET
>
> You may need to read pdf reference about graphics state operators. If I am
> not wrong "Td" uses relative coordinates so you may little play how to use
> HarfBuzz offsets and "advance" based on their example. For correct
> positioning you may need to draw each glyph individually or check "TJ"
> which allows individual horizontal glyph positioning.
>
> On Thu, Feb 27, 2020 at 2:02 PM Anand Ramasamy  wrote:
>
>> Dear Michal,
>>
>>
>>
>> Thank you for your help in advance.
>>
>>
>>
>> We were able to generate the glyphs and shape it using HarfBuzz libraries.
>>
>>
>>
>> Is there any way to write the generated glyphs as text in PDF document
>> using PoDoFo library?
>>
>>
>>
>> Regards,
>>
>> Anand.
>>
>> *From:* Anand Ramasamy
>> *Sent:* Thursday, February 13, 2020 9:26 PM
>> *To:* Michal Sudolsky 
>> *Cc:* podofo-users@lists.sourceforge.net
>> *Subject:* RE: [Podofo-users] Arabic characters not displayed properly
>> when AddText API is used.
>>
>>
>>
>> Dear Michal,
>>
>>
>>
>> Sorry for the delayed reply.
>>
>>
>>
>> Please let me  know whether there are any chances for Arabic fonts to be
>> supported by Podofo library in future.
>>
>>
>>
>> Regards,
>>
>> Anand.
>>
>> *From:* Michal Sudolsky 
>> *Sent:* Monday, November 18, 2019 7:48 PM
>> *To:* Anand Ramasamy 
>> *Cc:* podofo-users@lists.sourceforge.net
>> *Subject:* Re: [Podofo-users] Arabic characters not displayed properly
>> when AddText API is used.
>>
>>
>>
>> I am getting PdfError at "PdfString str1( ar.c_str() );":
>>
>>
>>
>> PoDoFo encountered an error. Error: 8 ePdfError_InternalLogic
>> Error Description: An internal error occurred.
>> Callstack:
>> #0 Error Source: podofo/base/PdfString.cpp:171
>>
>>
>>
>> My guess is that in order to join these characters you would need to do
>> kerning (https://en.wikipedia.org/wiki/Kerning)
>> or use ligatures/alternates (
>> https://en.wikipedia.org/wiki/Orthographic_ligature).
>> From my observation podofo does not support these things:
>>
>>
>>
>> https://sourceforge.net/p/podofo/mailman/message/36260915/

Re: [Podofo-users] Arabic characters not displayed properly when AddText API is used.

2020-02-27 Thread Michal Sudolsky
Hi,

This podofo thread may be useful:
https://sourceforge.net/p/podofo/mailman/message/32419281/

So suppose the font was added to the PDF, either through the podofo API or
manually as described there if needed.

You can get glyph ids and positions using HarfBuzz:

1. You may not be able to use PdfPainter API functions like DrawText or
AddText, because they all take Unicode strings, unless you somehow bypass
this and make them accept glyph ids. Even then, character positions may be
incorrect unless you draw individual glyphs at the positions from HarfBuzz.

2. Glyph ids from HarfBuzz should be what PDF drawing instructions like
"Tj" expect, so you can write them without PdfPainter. You can follow what
SetPage and DrawText do in PdfPainter. First get the PdfStream of the page
contents as SetPage does:
"pdf.GetPage(n)->GetContentsForAppending()->GetStream()".
Call BeginAppend, then Append whitespace and the text operators, and call
EndAppend. As in DrawText, use operators like this:

BT
/font_id font_size Tf
position_x position_y Td
<glyph_ids_hex> Tj
ET

For example, with a single glyph id 0xAF12 followed by two glyphs 0x1234
and 0xABCD (written high byte first, as PDF hex strings expect):

BT
/Ft14 24 Tf
10 20 Td
<AF12> Tj
30 40 Td
<1234ABCD> Tj
ET
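
As a rough sketch (not PoDoFo code), a BT ... ET block of this kind could be
generated from shaped glyph ids as below. GlyphRun, ToHexString and
BuildTextBlock are names made up for illustration, and the sketch assumes
the font uses a two-byte glyph id encoding such as Identity-H. The returned
string is the text you would pass to Append between BeginAppend and
EndAppend.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type: glyphs shown with one Tj at one Td position.
struct GlyphRun {
    double x;                        // Td operands; Td offsets are relative
    double y;                        // to the start of the current text line
    std::vector<std::uint16_t> ids;  // glyph ids as returned by the shaper
};

// Encode glyph ids as a PDF hex string, two bytes per glyph, high byte first.
std::string ToHexString(const std::vector<std::uint16_t>& ids)
{
    std::ostringstream out;
    out << '<' << std::hex << std::uppercase << std::setfill('0');
    for (std::uint16_t id : ids)
        out << std::setw(4) << static_cast<unsigned>(id);  // setw resets each time
    out << '>';
    return out.str();
}

// Build a BT ... ET block: select the font once, then one Td/Tj pair per run.
// fontName is the font's resource name on the page (assumed to exist already).
std::string BuildTextBlock(const std::string& fontName, double fontSize,
                           const std::vector<GlyphRun>& runs)
{
    assert(!fontName.empty());
    std::ostringstream out;
    out << "BT\n/" << fontName << ' ' << fontSize << " Tf\n";
    for (const GlyphRun& run : runs)
        out << run.x << ' ' << run.y << " Td\n"
            << ToHexString(run.ids) << " Tj\n";
    out << "ET\n";
    return out.str();
}
```

Calling BuildTextBlock("Ft14", 24, ...) with one run {10, 20, {0xAF12}} and
a second run {30, 40, {0x1234, 0xABCD}} gives the operators of the example.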

You may need to read the PDF reference on the text and graphics state
operators. If I am not wrong, "Td" uses coordinates relative to the start of
the current text line, so you may need to experiment a little with how to
apply the HarfBuzz offsets and "advance" values based on their example. For
correct positioning you may need to draw each glyph individually, or check
"TJ", which allows individual horizontal glyph positioning.
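
A minimal sketch of the "TJ" variant, under some assumptions: ShapedGlyph
and BuildTJ are hypothetical names, the shaped advance is given in font
units, pdfWidth is the width the PDF font declares for that glyph in
thousandths of an em, and per-glyph x/y offsets are ignored. Numbers inside
a TJ array are in thousandths of a text space unit and are subtracted from
the current position, so the adjustment is the declared width minus the
shaped advance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-glyph record filled from shaper output and font metrics.
struct ShapedGlyph {
    std::uint16_t id;    // glyph id
    long shapedAdvance;  // horizontal advance from the shaper, in font units
    long pdfWidth;       // width declared in the PDF font, in 1/1000 em
};

// Build "[<gid> adj <gid> ...] TJ" for one horizontal run of glyphs.
std::string BuildTJ(const std::vector<ShapedGlyph>& glyphs, long unitsPerEm)
{
    assert(unitsPerEm > 0);
    std::ostringstream out;
    out << '[';
    for (std::size_t i = 0; i < glyphs.size(); ++i) {
        const ShapedGlyph& g = glyphs[i];
        out << '<' << std::hex << std::uppercase << std::setfill('0')
            << std::setw(4) << static_cast<unsigned>(g.id) << std::dec << '>';
        if (i + 1 < glyphs.size()) {
            // Convert the shaped advance to 1/1000 em, then compute how far
            // the next glyph must be pulled back (positive) or pushed on
            // (negative) relative to the width the viewer would apply.
            long shaped = std::lround(1000.0 * g.shapedAdvance / unitsPerEm);
            long adjust = g.pdfWidth - shaped;
            if (adjust != 0)
                out << ' ' << adjust << ' ';
        }
    }
    out << "] TJ\n";
    return out.str();
}
```

The string replaces the "<...> Tj" line inside a BT ... ET block. Vertical
offsets cannot be expressed this way and would still need a separate "Td"
per glyph.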

On Thu, Feb 27, 2020 at 2:02 PM Anand Ramasamy  wrote:

> Dear Michal,
>
>
>
> Thank you for your help in advance.
>
>
>
> We were able to generate the glyphs and shape it using HarfBuzz libraries.
>
>
>
> Is there any way to write the generated glyphs as text in PDF document
> using PoDoFo library?
>
>
>
> Regards,
>
> Anand.
>
> *From:* Anand Ramasamy
> *Sent:* Thursday, February 13, 2020 9:26 PM
> *To:* Michal Sudolsky 
> *Cc:* podofo-users@lists.sourceforge.net
> *Subject:* RE: [Podofo-users] Arabic characters not displayed properly
> when AddText API is used.
>
>
>
> Dear Michal,
>
>
>
> Sorry for the delayed reply.
>
>
>
> Please let me  know whether there are any chances for Arabic fonts to be
> supported by Podofo library in future.
>
>
>
> Regards,
>
> Anand.
>
> *From:* Michal Sudolsky 
> *Sent:* Monday, November 18, 2019 7:48 PM
> *To:* Anand Ramasamy 
> *Cc:* podofo-users@lists.sourceforge.net
> *Subject:* Re: [Podofo-users] Arabic characters not displayed properly
> when AddText API is used.
>
>
>
> I am getting PdfError at "PdfString str1( ar.c_str() );":
>
>
>
> PoDoFo encountered an error. Error: 8 ePdfError_InternalLogic
> Error Description: An internal error occurred.
> Callstack:
> #0 Error Source: podofo/base/PdfString.cpp:171
>
>
>
> My guess is that in order to join these characters you would need to do
> kerning (https://en.wikipedia.org/wiki/Kerning)
> or use ligatures/alternates (
> https://en.wikipedia.org/wiki/Orthographic_ligature).
> From my observation podofo does not support these things:
>
>
>
> https://sourceforge.net/p/podofo/mailman/message/36260915/
>
> https://sourceforge.net/p/podofo/mailman/message/36683139/
>
>
>
> From what I googled indicates that seems truetype fonts do not support
> ligatures but opentype fonts do support them. And seems podofo does not
> support opentype fonts (also from my experience regarding font subsetting):
>
>
>
> htt

Re: [Podofo-users] [External] Re: Patch for ignoring broken objects

2020-02-26 Thread Michal Sudolsky
>
>
> Hi,
> just to make things clearer, I also do not like global variables as
> such. What I've been thinking of was to provide something similar to
> PdfError::EnableLogging() and PdfError::EnableDebug(). That would work
> fine, no?
> Bye,
> zyx
>
>
There is a small difference. Logging really is global, since any function
or class method can log, whereas ignoring broken objects is local to the
parser. But thinking about it, a global setting could also be fine, as I
cannot even think of a situation where I would not want to be able to open
such broken PDFs (maybe it is useful for checking generated PDFs in podofo
tests?), unless someone really needs different settings in different
threads for some reason, now or in the future.



>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>


Re: [Podofo-users] [External] Re: Patch for ignoring broken objects

2020-02-26 Thread Michal Sudolsky
>
>
> Eventually, it could be a global (static) property of the PdfParser, to
> make things even simpler. Would there be a need to use PdfParser from
> multiple threads with different setting at the same time? I guess not.
> This would make the patch much smaller.
>
> Ideas?
>

I think it would be better not to use a global variable here, regardless of
whether it makes sense to use different settings from different threads.

> Thanks and bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>


Re: [Podofo-users] [BRANCH] PdfObject automatic ownership handling

2020-02-24 Thread Michal Sudolsky
>
> (to make
> it easier to merge your giant change, to deal with as less conflicts as
> possible).
>

It would be easier with git. Probably more people would collaborate too, as
it is an easier system than sending patches via email, where resolving merge
conflicts is up to you.


>
> Are you still willing to merge it yourself, please?
>
> My idea is that you know the change the best (even after so many
> months, I'm sorry), thus I hoped you'd finally merge the change
> yourself too. If not, then no problem, just let me know (through the
> list) and I'll do it myself.
>
> Thanks and bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>


Re: [Podofo-users] [External] Re: Patch for ignoring broken objects

2020-02-21 Thread Michal Sudolsky
Personally, I might not have added a new parameter but instead tried to fix
it so that the value passed to SetIgnoreBrokenObjects() is not overwritten
in Init(), and made true the default. But I am also OK with the current
patch, so I do not plan to do this. If someone wants to, they can.
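
To illustrate the idea only (this is a made-up class, not the actual
PdfParser code or the patch): the option is a per-instance member with true
as its default, and Init() resets only per-load state, so a value set
before loading survives. ParserSketch, Load and the member names are
assumptions for this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a parser class; not PoDoFo's real implementation.
class ParserSketch {
public:
    // Per-instance option, so two parsers (or threads) can use different values.
    void SetIgnoreBrokenObjects(bool ignore) { m_ignoreBrokenObjects = ignore; }
    bool GetIgnoreBrokenObjects() const { return m_ignoreBrokenObjects; }

    void Load(const std::string& filename)
    {
        assert(!filename.empty());
        Init();
        m_filename = filename;
        // ... real parsing would consult m_ignoreBrokenObjects here ...
    }

private:
    // Resets per-load state only; user options are deliberately left alone.
    void Init()
    {
        m_filename.clear();
        m_objectCount = 0;
    }

    bool m_ignoreBrokenObjects = true;  // ignoring broken objects by default
    std::string m_filename;
    int m_objectCount = 0;
};
```

With this shape no extra parameter is needed in the "Load" functions: a
caller who wants strict parsing calls SetIgnoreBrokenObjects(false) first.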


On Fri, Feb 21, 2020 at 7:45 PM Michal Sudolsky  wrote:

> Hi,
>
> I do not think that this patch is bad. It fixes problems and can be
> amended also later to ignore broken objects by default so there is no
> reason to not apply it in its current form. Except that someone does not
> like new parameter in "Load" functions but wants to turn it on or off using
> some class member method.
>
> On Thu, Feb 20, 2020 at 8:27 AM zyx  wrote:
>
>> On Wed, 2020-02-19 at 16:57 +0100, Michal Sudolsky wrote:
>> > I think ignoring broken objects should be the default behaviour
>>
>> Hi,
>> I can second that, it makes perfect sense.
>>
>> Thanks for the corrected review of the patch. I only briefly read it,
>> which did not reveal any obvious problem. I didn't test it in action.
>> My fault.
>>
>> Bye,
>> zyx
>>
>>
>>
>> ___
>> Podofo-users mailing list
>> Podofo-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/podofo-users
>>
>


Re: [Podofo-users] [External] Re: Patch for ignoring broken objects

2020-02-21 Thread Michal Sudolsky
Hi,

I do not think that this patch is bad. It fixes problems and can also be
amended later to ignore broken objects by default, so there is no reason not
to apply it in its current form, unless someone does not like the new
parameter in the "Load" functions and would rather turn it on or off using a
class member method.

On Thu, Feb 20, 2020 at 8:27 AM zyx  wrote:

> On Wed, 2020-02-19 at 16:57 +0100, Michal Sudolsky wrote:
> > I think ignoring broken objects should be the default behaviour
>
> Hi,
> I can second that, it makes perfect sense.
>
> Thanks for the corrected review of the patch. I only briefly read it,
> which did not reveal any obvious problem. I didn't test it in action.
> My fault.
>
> Bye,
> zyx
>
>
>
> ___
> Podofo-users mailing list
> Podofo-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/podofo-users
>

