Re: NVD update hangs during release build

2024-03-21 Thread sahy...@fileaffairs.de
OK, I can reproduce the issue too. It works for me locally with
dependency-check-maven up to 8.4.3. Would that be an option?

BR
Maruan

On Thursday, 21.03.2024 at 17:38 +0100, Tilman Hausherr wrote:
> add
> 
> -Ppedantic
> 
> Tilman
> 
> On 21.03.2024 17:28, sahy...@fileaffairs.de wrote:
> > Which mvn command do I need to issue to trigger the check? mvn clean
> > install didn't do it for me. Am I missing something?
> > 
> > BR
> > Maruan
> > 
> > On Thursday, 21.03.2024 at 17:24 +0100, Tilman Hausherr wrote:
> > > Jeremy Long wrote something that I haven't really understood. Maybe
> > > it means building the NVD archive on a separate system and then
> > > transferring it.
> > > 
> > > https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2011824975
> > > 
> > > However, a later message in the same issue made more sense; I'm
> > > testing locally with
> > > 
> > > https://dependency-check.github.io/DependencyCheck_Builder/nvd_cache/
> > > 
> > > 
> > > Tilman
> > > 
> > > On 21.03.2024 09:48, sahy...@fileaffairs.de wrote:
> > > > Mhmm - is there a way to build locally and test the NVD update?
> > > > 
> > > > > Ran it on a different project I have for a client locally and
> > > > > NVD update worked without issues and without an API key.
> > > > 
> > > > BR
> > > > Maruan
> > > > 
> > > > On Thursday, 21.03.2024 at 08:36 +0100, Tilman Hausherr wrote:
> > > > > I meant adding true to the  part.
> > > > > 
> > > > > Something isn't ok with NVD, maybe it got worse since then:
> > > > > https://blog.fefe.de/?ts=9b0740e0
> > > > > https://www.heise.de/news/Sicherheitsforscher-genervt-Luecken-Datenbank-NVD-seit-Wochen-unvollstaendig-9656574.html
> > > > > 
> > > > > Tilman
> > > > > 
> > > > > On 20.03.2024 22:05, Andreas Lehmkühler wrote:
> > > > > > On 20.03.24 at 21:16, Tilman Hausherr wrote:
> > > > > > > If you still have the time, you could add a "skip" for that
> > > > > > > plugin; the last successful build was this morning and no
> > > > > > > library changes were made since then. (And we still have a
> > > > > > > few days to find out if any libraries are now considered
> > > > > > > risky.)
> > > > > > Good idea, but -Ddependency-check.skip=true doesn't work
> > > > > > either, it still tries to update :-(
> > > > > > 
> > > > > > I'm going to continue tomorrow 
> > > > > > 
> > > > > > Andreas
> > > > > > 
> > > > > > > Tilman
> > > > > > > 
> > > > > > > On 20.03.2024 21:13, Tilman Hausherr wrote:
> > > > > > > > Seems it's a general problem:
> > > > > > > > https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2009879851
> > > > > > > >    
> > > > > > > > 
> > > > > > > > 
> > > > > > > > it also hangs on my local machine now, I don't have an
> > > > > > > > API key.
> > > > > > > > 
> > > > > > > > Tilman
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 20.03.2024 20:57, Andreas Lehmkühler wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > I'm trying to cut the 2.0.31 release but it always
> > > > > > > > > hangs when the build tries to update the NVD data.
> > > > > > > > > 
> > > > > > > > > Last week when I built the 3.0.2 release I had a
> > > > > > > > > similar effect.
>

Re: NVD update hangs during release build

2024-03-21 Thread sahy...@fileaffairs.de
Which mvn command do I need to issue to trigger the check? mvn clean
install didn't do it for me. Am I missing something?

BR
Maruan

On Thursday, 21.03.2024 at 17:24 +0100, Tilman Hausherr wrote:
> Jeremy Long wrote something that I haven't really understood. Maybe
> it means building the NVD archive on a separate system and then
> transferring it.
> 
> https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2011824975
> 
> However, a later message in the same issue made more sense; I'm
> testing locally with
> 
> https://dependency-check.github.io/DependencyCheck_Builder/nvd_cache/
> 
> 
> Tilman
> 
> On 21.03.2024 09:48, sahy...@fileaffairs.de wrote:
> > Mhmm - is there a way to build locally and test the NVD update?
> > 
> > Ran it on a different project I have for a client locally and NVD
> > update worked without issues and without an API key.
> > 
> > BR
> > Maruan
> > 
> > On Thursday, 21.03.2024 at 08:36 +0100, Tilman Hausherr wrote:
> > > I meant adding true to the  part.
> > > 
> > > Something isn't ok with NVD, maybe it got worse since then:
> > > https://blog.fefe.de/?ts=9b0740e0
> > > https://www.heise.de/news/Sicherheitsforscher-genervt-Luecken-Datenbank-NVD-seit-Wochen-unvollstaendig-9656574.html
> > > 
> > > Tilman
> > > 
> > > On 20.03.2024 22:05, Andreas Lehmkühler wrote:
> > > > 
> > > > On 20.03.24 at 21:16, Tilman Hausherr wrote:
> > > > > If you still have the time, you could add a "skip" for that
> > > > > plugin; the last successful build was this morning and no
> > > > > library changes were made since then. (And we still have a few
> > > > > days to find out if any libraries are now considered risky.)
> > > > Good idea, but -Ddependency-check.skip=true doesn't work
> > > > either, it still tries to update :-(
> > > > 
> > > > I'm going to continue tomorrow 
> > > > 
> > > > Andreas
> > > > 
> > > > > Tilman
> > > > > 
> > > > > On 20.03.2024 21:13, Tilman Hausherr wrote:
> > > > > > Seems it's a general problem:
> > > > > > https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2009879851
> > > > > >   
> > > > > > 
> > > > > > 
> > > > > > it also hangs on my local machine now, I don't have an API
> > > > > > key.
> > > > > > 
> > > > > > Tilman
> > > > > > 
> > > > > > 
> > > > > > On 20.03.2024 20:57, Andreas Lehmkühler wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I'm trying to cut the 2.0.31 release but it always hangs
> > > > > > > when the build tries to update the NVD data.
> > > > > > > 
> > > > > > > Last week when I built the 3.0.2 release I had a similar
> > > > > > > effect. The update was very slow, but in the end it
> > > > > > > finished and worked.
> > > > > > > 
> > > > > > > Now, nothing happens, the last words are
> > > > > > > 
> > > > > > > [INFO] [WARNING] An NVD API Key was not provided - it is
> > > > > > > highly recommended to use an NVD API key as the update can
> > > > > > > take a VERY long time without an API Key
> > > > > > > 
> > > > > > > nothing more after that. It simply hangs.
> > > > > > > 
> > > > > > > I've requested an API key, got one, and now I'm trying to
> > > > > > > get it to work, but it doesn't.
> > > > > > > 
> > > > > > > I've tried
> > > > > > > 
> > > > > > > * the mvn option -DnvdApiKey=
> > > > > > > * define a server "nvd" in .m2/settings.xml including the
> > > > > > > key
> > > > >

Re: NVD update hangs during release build

2024-03-21 Thread sahy...@fileaffairs.de
Mhmm - is there a way to build locally and test the NVD update?

Ran it on a different project I have for a client locally and NVD
update worked without issues and without an API key.

BR
Maruan

On Thursday, 21.03.2024 at 08:36 +0100, Tilman Hausherr wrote:
> I meant adding true to the  part.
> 
> Something isn't ok with NVD, maybe it got worse since then:
> https://blog.fefe.de/?ts=9b0740e0
> https://www.heise.de/news/Sicherheitsforscher-genervt-Luecken-Datenbank-NVD-seit-Wochen-unvollstaendig-9656574.html
> 
> Tilman
> 
> On 20.03.2024 22:05, Andreas Lehmkühler wrote:
> > 
> > 
> > On 20.03.24 at 21:16, Tilman Hausherr wrote:
> > > If you still have the time, you could add a "skip" for that
> > > plugin; the last successful build was this morning and no library
> > > changes were made since then. (And we still have a few days to find
> > > out if any libraries are now considered risky.)
> > Good idea, but -Ddependency-check.skip=true doesn't work either, it
> > still tries to update :-(
> > 
> > I'm going to continue tomorrow 
> > 
> > Andreas
> > 
> > > 
> > > Tilman
> > > 
> > > On 20.03.2024 21:13, Tilman Hausherr wrote:
> > > > Seems it's a general problem:
> > > > https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2009879851
> > > >  
> > > > 
> > > > 
> > > > it also hangs on my local machine now, I don't have an API key.
> > > > 
> > > > Tilman
> > > > 
> > > > 
> > > > On 20.03.2024 20:57, Andreas Lehmkühler wrote:
> > > > > Hi,
> > > > > 
> > > > > I'm trying to cut the 2.0.31 release but it always hangs when
> > > > > the build tries to update the NVD data.
> > > > > 
> > > > > Last week when I built the 3.0.2 release I had a similar
> > > > > effect. The update was very slow, but in the end it finished
> > > > > and worked.
> > > > > 
> > > > > Now, nothing happens, the last words are
> > > > > 
> > > > > [INFO] [WARNING] An NVD API Key was not provided - it is highly
> > > > > recommended to use an NVD API key as the update can take a VERY
> > > > > long time without an API Key
> > > > > 
> > > > > nothing more after that. It simply hangs.
> > > > > 
> > > > > I've requested an API key, got one, and now I'm trying to get
> > > > > it to work, but it doesn't.
> > > > > 
> > > > > I've tried
> > > > > 
> > > > > * the mvn option -DnvdApiKey=
> > > > > * define a server "nvd" in .m2/settings.xml including the key
> > > > > and add -DnvdApiServerId=nvd to the command line
> > > > > * define the environment variable NVD_API_KEY and add
> > > > > -DnvdApiKeyEnvironmentVariable=NVD_API_KEY to the command line
> > > > > 
> > > > > Nothing works, I always get those famous words: An NVD API Key
> > > > > was not provided
> > > > > 
> > > > > 
> > > > > Any idea to get around this?
> > > > > 
> > > > > Andreas
> > > > > 
> > > > > P.S.: I'm on Linux using corretto-8.332 and mvn 3.9.3
> > > > > 
> > > > > 
> > > > > -
> > > > > 
> > > > > To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
> > > > > For additional commands, e-mail: dev-h...@pdfbox.apache.org
> > > > > 



Re: Apache PDFBox Board Report January 2024 due

2024-01-07 Thread sahy...@fileaffairs.de
+1 

Maruan

On Monday, 08.01.2024 at 08:14 +0100, Andreas Lehmkühler wrote:
> Hi,
> 
> find attached a quick draft of the board report we're expected to
> submit 
> this month. It's based upon the report wizard template which can be 
> found at [1]
> 
> Any comments or additions are appreciated ...
> 
> 
> ## Description:
> The mission of PDFBox is the creation and maintenance of software
> related to a Java library for working with PDF documents
> 
> ## Project Status:
> Current project status: ongoing with moderate activity
> Issues for the board: none
> 
> ## Membership Data:
> Apache PDFBox was founded 2009-10-21 (14 years ago)
> There are currently 21 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
> 
> Community changes, past quarter:
> - No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
> - No new committers. Last addition was Joerg O. Henne on 2017-10-09.
> 
> ## Project Activity:
> Recent releases:
> 
>  3.0.1 was released on 2023-11-30.
>  2.0.30 was released on 2023-11-04.
>  3.0.0 was released on 2023-08-17.
> 
> ## Community Health:
> - there is a steady stream of contributions, bug reports and
> questions 
> on the mailing lists
> - we released the first minor release of our new 3.0.x line to fix
> some 
> regression issues. A couple of improvements and further fixes were 
> included as well.
> - the development of the current trunk version 4.0.0 is an ongoing 
> effort, e.g. we switched to Log4j2 and did some major refactorings
> 
> 



Set of test documents for PDF parsing

2023-12-13 Thread sahy...@fileaffairs.de
Hi,

I came across a public repository from the PDF Association with material
for testing the lexical analysis (parsing) of PDFs.

https://github.com/pdf-association/safedocs

Might help to use some in our testbed.

BR
Maruan




Re: Vertical cut off in AcroForm text fields

2023-12-13 Thread sahy...@fileaffairs.de
PDFBox calculates the font size to be 14.6036 where Acrobat calculates
8.931. The placement of the text start is also different.

Need to check.

BR
Maruan 


On Monday, 11.12.2023 at 21:01 +0100, Tilman Hausherr wrote:
> He has posted the file:
> https://drive.google.com/file/d/1BaU6iR4yIQAT-oQxGKFnsiIkLjm54GCE/view?usp=sharing
> 
> I can confirm the effect. Adobe shows a smaller font when clicking on
> it. It also happens when I set the field to a shorter text
> 
>  PDDocument doc = Loader.loadPDF(new File("/unflattened.pdf"));
>  PDTextField field = (PDTextField) doc.getDocumentCatalog().getAcroForm().getField("homePsdMunicipality");
>  field.setValue("pg");
>  doc.save(new File("...../unflattenednew.pdf"));
> 
> 
> 
> Tilman
> 
> On 28.11.2023 21:38, sahy...@fileaffairs.de wrote:
> > we'd need the sample PDF and compare with what Acrobat does. But at
> > certain sizes a clipping might occur as the scaling ends at a
> > certain size. That's also what Acrobat does.
> > 
> > All that stuff is based on looking at Acrobat generated samples and
> > trying to reproduce that in PDFBox. There is no detailed
> > documentation about the layout model as this is not part of the PDF
> > spec.
> > 
> > BR
> > Maruan
> > 
> > On Tuesday, 28.11.2023 at 21:03 +0100, Tilman Hausherr wrote:
> > > https://stackoverflow.com/questions/77558220/apache-pdfbox-gray-faint-underscores-2-0-26
> > > 
> > > I'm wondering if this is a bug or a feature / misunderstanding,
> > > that an autosize acroform text field may cut off glyphs vertically
> > > in some cases?
> > > 
> > > Tilman
> > > 



Re: A question about the code to set the default value in PDNonTerminalField.java ...

2023-12-12 Thread sahy...@fileaffairs.de
having an account to file an issue would be good. You might want to
wait until something else pops up.

Thank you very much for your report.

BR
Maruan

On Tuesday, 12.12.2023 at 08:17 -0600, Dwayne Parks wrote:
> I just saw that you have already filed a bug for this.  That's great!
> 
> I'm expecting to be using PDFBox more in the future and would be
> happy 
> to at least report any issues that I find here.  However, if it makes
> more sense for me to go ahead and get a Jira account at Apache.org, I
> can do that so that I can email this list and file an issue right
> away 
> if that is the recommendation.
> 
> FYI, I'd likely do so on my personal account (on GMail) as I might
> have 
> time outside of work to contribute more than just reporting on
> issues.
> 
> Would it be recommended to go ahead and get the Apache Jira account
> set 
> up?  Thanks!
> 
> - Dwayne
> 
> On 12/12/2023 7:51 AM, Dwayne Parks wrote:
> > Will do! Thanks!
> > 
> > On 12/11/2023 3:43 PM, sahy...@fileaffairs.de wrote:
> > > Hi,
> > > 
> > > On Monday, 11.12.2023 at 15:23 -0600, Dwayne Parks wrote:
> > > > Hi there!
> > > > 
> > > > I was glancing through the code to PDFBox 3.0.1 to better grok
> > > > PDF form fields/widgets and the hierarchical way they are
> > > > organized and I ran across something that might be a bug in the
> > > > code.
> > > > 
> > > > Or... it might just be my lack of understanding of how PDF
> > > > default values work in non-terminal field objects.  At the very
> > > > least I found it surprising and different from the code in other
> > > > types of PDField subclasses.  So I decided to bring it up here on
> > > > the dev mailing list.
> > > > 
> > > > In the PDNonTerminalField.java file, the setDefaultValue() method
> > > > logic looks like this [1]:
> > > > 
> > > >  getCOSObject().setItem(COSName.V, value);
> > > > 
> > > > It appears to set the value of the COSName.V item...  while the
> > > > getDefaultValue() method in the class (and the setDefaultValue()
> > > > methods in other PDField subclasses that I checked) use the
> > > > COSName.DV value as expected.
> > > > 
> > > > Is this a bug, or is this intentional?
> > > 
> > > that's a bug - could you file a bug report for that?
> > > 
> > > BR
> > > Maruan
> > > 
> > > > 
> > > > Thank you for your time,
> > > > 
> > > > - Dwayne
> > > > 
> > > > 
> > > > [1]
> > > > https://github.com/apache/pdfbox/blob/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/interactive/form/PDNonTerminalField.java#L258
> > > > 
> > > > 



Re: A question about the code to set the default value in PDNonTerminalField.java ...

2023-12-12 Thread sahy...@fileaffairs.de
I've opened https://issues.apache.org/jira/browse/PDFBOX-5735 for that

BR
Maruan

On Tuesday, 12.12.2023 at 10:02 +0100, Tilman Hausherr wrote:
> 
> > that's a bug - could you file a bug report for that?
> 
> If you have to sign up with JIRA because you're not a member yet, use
> the subject of this post for the field at the bottom.
> 
> https://issues.apache.org/jira/browse/PDFBOX
> 
> Tilman





Re: A question about the code to set the default value in PDNonTerminalField.java ...

2023-12-11 Thread sahy...@fileaffairs.de
Hi,

On Monday, 11.12.2023 at 15:23 -0600, Dwayne Parks wrote:
> Hi there!
> 
> I was glancing through the code to PDFBox 3.0.1 to better grok PDF
> form 
> fields/widgets and the hierarchical way they are organized and I ran 
> across something that might be a bug in the code.
> 
> Or... it might just be my lack of understanding of how PDF default 
> values work in non-terminal field objects.  At the very least I found
> it 
> surprising and different from the code in other types of PDField 
> subclasses.  So I decided to bring it up here on the dev mailing
> list.
> 
> In the PDNonTerminalField.java file, the setDefaultValue() method
> logic 
> looks like this [1]:
> 
>  getCOSObject().setItem(COSName.V, value);
> 
> It appears to set the value of the COSName.V item...  while the 
> getDefaultValue() method in the class (and the setDefaultValue()
> methods 
> in other PDField subclasses that I checked) use the COSName.DV value
> as 
> expected.
> 
> Is this a bug, or is this intentional?

that's a bug - could you file a bug report for that?

BR
Maruan
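
For readers following along, here is a minimal sketch of why the reported
behaviour is surprising. It does not use the PDFBox classes: the class
DefaultValueSketch and its method names are hypothetical, and the field
dictionary is modelled as a plain map with the keys V and DV named in the
report. It only illustrates that a setter writing V leaves a getter reading
DV empty.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for a non-terminal field. The field dictionary is
 * modelled as a plain map; this is NOT the PDFBox implementation.
 */
public class DefaultValueSketch {
    // Dictionary keys as named in the thread: V = value, DV = default value.
    public static final String V = "V";
    public static final String DV = "DV";

    private final Map<String, String> dict = new HashMap<>();

    /** Behaviour as reported: the default-value setter writes the V entry. */
    public void setDefaultValueAsReported(String value) {
        dict.put(V, value);
    }

    /** Behaviour the report expects: the setter writes the DV entry. */
    public void setDefaultValueExpected(String value) {
        dict.put(DV, value);
    }

    public String getValue() {
        return dict.get(V);
    }

    public String getDefaultValue() {
        return dict.get(DV);
    }

    public static void main(String[] args) {
        DefaultValueSketch reported = new DefaultValueSketch();
        reported.setDefaultValueAsReported("abc");
        // The default value was "set" but cannot be read back; V changed instead.
        System.out.println("reported: DV=" + reported.getDefaultValue()
                + ", V=" + reported.getValue());

        DefaultValueSketch expected = new DefaultValueSketch();
        expected.setDefaultValueExpected("abc");
        System.out.println("expected: DV=" + expected.getDefaultValue()
                + ", V=" + expected.getValue());
    }
}
```

A round-trip check of this kind (set the default value, read it back, and
confirm the current value is untouched) is probably the simplest regression
test for the issue.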

> 
> Thank you for your time,
> 
> - Dwayne
> 
> 
> [1] 
> https://github.com/apache/pdfbox/blob/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/interactive/form/PDNonTerminalField.java#L258
> 



Re: Vertical cut off in AcroForm text fields

2023-12-11 Thread sahy...@fileaffairs.de
thx - will take a look Wednesday - fully booked tomorrow

On Monday, 11.12.2023 at 21:01 +0100, Tilman Hausherr wrote:
> He has posted the file:
> https://drive.google.com/file/d/1BaU6iR4yIQAT-oQxGKFnsiIkLjm54GCE/view?usp=sharing
> 
> I can confirm the effect. Adobe shows a smaller font when clicking on
> it. It also happens when I set the field to a shorter text
> 
>  PDDocument doc = Loader.loadPDF(new File("/unflattened.pdf"));
>  PDTextField field = (PDTextField) doc.getDocumentCatalog().getAcroForm().getField("homePsdMunicipality");
>  field.setValue("pg");
>  doc.save(new File("...../unflattenednew.pdf"));
> 
> 
> 
> Tilman
> 
> On 28.11.2023 21:38, sahy...@fileaffairs.de wrote:
> > we'd need the sample PDF and compare with what Acrobat does. But at
> > certain sizes a clipping might occur as the scaling ends at a
> > certain size. That's also what Acrobat does.
> > 
> > All that stuff is based on looking at Acrobat generated samples and
> > trying to reproduce that in PDFBox. There is no detailed
> > documentation about the layout model as this is not part of the PDF
> > spec.
> > 
> > BR
> > Maruan
> > 
> > On Tuesday, 28.11.2023 at 21:03 +0100, Tilman Hausherr wrote:
> > > https://stackoverflow.com/questions/77558220/apache-pdfbox-gray-faint-underscores-2-0-26
> > > 
> > > I'm wondering if this is a bug or a feature / misunderstanding,
> > > that an autosize acroform text field may cut off glyphs vertically
> > > in some cases?
> > > 
> > > Tilman
> > > 



Re: Vertical cut off in AcroForm text fields

2023-11-28 Thread sahy...@fileaffairs.de
we'd need the sample PDF and compare with what Acrobat does. But at
certain sizes a clipping might occur as the scaling ends at a certain
size. That's also what Acrobat does.

All that stuff is based on looking at Acrobat generated samples and
trying to reproduce that in PDFBox. There is no detailed documentation
about the layout model as this is not part of the PDF spec.

BR
Maruan

On Tuesday, 28.11.2023 at 21:03 +0100, Tilman Hausherr wrote:
> https://stackoverflow.com/questions/77558220/apache-pdfbox-gray-faint-underscores-2-0-26
> 
> I'm wondering if this is a bug or a feature / misunderstanding, that
> an 
> autosize acroform text field may cut off glyphs vertically in some
> cases?
> 
> Tilman
> 



Re: [VOTE] Release Apache PDFBox 3.0.1

2023-11-27 Thread sahy...@fileaffairs.de
+1

Maruan

On Monday, 27.11.2023 at 17:46 +0100, Andreas Lehmkühler wrote:
> Hi,
> 
> a candidate for the PDFBox 3.0.1 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/3.0.1/
> 
> The release candidate is a zip archive of the sources in:
> 
>  https://svn.apache.org/repos/asf/pdfbox/tags/3.0.1/
> 
> The SHA-512 checksum of the archive is
> 8ca8f3297ec04efaa23ab6d9ca421c1b39d8fb2de392e0f7b5aa6e7053eac75066e8b2872dc6b6847a0194b557aa8570de7f1d1a122fcf3888bf9ed21eae0257.
> 
> Please vote on releasing this package as Apache PDFBox 3.0.1.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 3.0.1
>  [ ] -1 Do not release this package because...
> 
> Here is my +1
> 
> Andreas
> 
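
For anyone checking a release candidate against the SHA-512 value quoted in
a vote mail, a minimal sketch using only the JDK might look like this. The
class name Sha512CheckSketch and the command-line arguments are assumptions
for illustration, not part of the PDFBox release tooling. The comparison
strips whitespace and dots because the checksum may arrive wrapped over
several lines and followed by a full stop, as it often is in mail.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Hypothetical helper; not part of the PDFBox release tooling. */
public class Sha512CheckSketch {

    /** Streams the input through SHA-512 and returns the lowercase hex digest. */
    public static String sha512Hex(InputStream in)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            md.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Compares an expected checksum as pasted from a mail (possibly wrapped
     * over lines, upper case, or followed by a full stop) with a hex digest.
     */
    public static boolean matches(String expectedFromMail, String actualHex) {
        String normalized = expectedFromMail.replaceAll("[\\s.]", "")
                .toLowerCase(Locale.ROOT);
        return normalized.equals(actualHex.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: Sha512CheckSketch <archive> <expected-sha512>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            boolean ok = matches(args[1], sha512Hex(in));
            System.out.println(ok ? "checksum OK" : "checksum MISMATCH");
        }
    }
}
```

Streaming the file in chunks keeps memory use flat regardless of archive
size. A matching checksum only shows the download is intact, so checking the
release signature is still a separate step.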



Re: [VOTE] Release Apache PDFBox 2.0.30

2023-11-04 Thread sahy...@fileaffairs.de
+1
Maruan

On Wednesday, 01.11.2023 at 20:23 +0100, Andreas Lehmkühler wrote:
> Hi,
> 
> a candidate for the PDFBox 2.0.30 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/2.0.30/
> 
> The release candidate is a zip archive of the sources in:
> 
>  https://svn.apache.org/repos/asf/pdfbox/tags/2.0.30/
> 
> The SHA-512 checksum of the archive is
> c1e66695af16396f6a36d02972270651a4630b36799e1fe13262c5748b18cfcbb46829c847ab4993832018f5f8a0546eb468cafdb36019314e275351569d52cc.
> 
> Please vote on releasing this package as Apache PDFBox 2.0.30.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 2.0.30
>  [ ] -1 Do not release this package because...
> 
> 
> Here is my +1
> 
> Andreas
> 



Re: PDFBox 4.0 and development plans

2023-10-14 Thread sahy...@fileaffairs.de
On Saturday, 14.10.2023 at 13:36 +0200, Tilman Hausherr wrote:
> On 11.10.2023 07:53, sahy...@fileaffairs.de wrote:
> > With regards to versioning I'd like to propose that we have 2.0 as
> > LTS 
> > and 4.x being the next LTS.

Although the two projects cannot be compared, maybe we can adopt something
similar to
https://camel.apache.org/blog/2020/03/LTS-Release-Schedule/

e.g. we do a 4.0.0 as LTS;
4.1.0, 4.2.0 ... will not be LTS, 4.3.0 might be.

Where 4.x are feature releases with faster increments than what we do
today. E.g. we might not implement all PDF 2.0 bits in 4.0 but maybe
only certain parts like annotations, forms or signature algorithms,
and then add the next bits in 4.1 and so on, planning the boundaries
upfront.

This way instead of doing a big release like 3.0 after a very long
period of time we have faster cycles. 

Bug fixes will only be done to LTS, e.g. in my sample above 4.0 will
have patch releases like 4.0.1, 4.0.2 ... but 4.1 will not. We can also
limit the lifetime of an LTS to a certain (not too long) period of time.
That depends on how fast we think we can do minor releases. LTS would
receive bug fixes only, no new features. So users with less critical
applications can adopt new releases quicker. And because each release
has less new stuff we limit the risk of breaking things.

From that perspective we need to decide for 4.0 what will be breaking
changes. Others can be done either in 4.0 or later. E.g. appearance
handlers for form widgets can be done in a non-breaking but additive
manner.

BR
Maruan


> 
> Yes, very good idea.
> 
> Tilman
> 



Re: PDFBox 4.0 and development plans

2023-10-13 Thread sahy...@fileaffairs.de
On Friday, 13.10.2023 at 08:23 +0200, Andreas Lehmkühler wrote:
> 
> 
> On 13.10.23 at 04:40, axh wrote:
> > Hi,
> > 
> > I suggest to also revisit logging. Last week I opened an issue for
> > that (PDFBOX-5695
> > <https://issues.apache.org/jira/browse/PDFBOX-5695>), but it seems
> > everybody is tired of this subject and no one even looked at it.
> > Nonetheless, please take a look. The last time a switch to a
> > logging facade was proposed (and rejected) has been 10 years ago. I
> > think it is worth reconsidering, and a new major release would be
> > the right time to do a change like that. More details in the issue.
> > 
> Please don't give up on us too early. We are all volunteers with
> limited time and different priorities.
> 
> > Whatever the project decides, I am willing to contribute the
> > required patch(es).
> We highly appreciate that.
> 
> I personally don't have the pressure to switch the logging framework
> but I see it is long overdue to overhaul that part of PDFBox.
> 
> I tend to agree with Tilman and I'd like to use log4j2. I hope I'll
> find some time to comment on your proposal next weekend.

+1 to switch to log4j2. The benefits are described in the ticket. As for
log4j vs slf4j: log4j because of the Apache License and "family", although
license-wise MIT would be compliant AFAIU.

Maruan 

> 
> 
> Andreas
> 
> > 
> > Cheers,
> > Axel
> > 
> > > On 11.10.2023 at 07:53, sahy...@fileaffairs.de wrote:
> > > 
> > > Dear colleagues,
> > > 
> > > with 3.0 being released and 4.0 being started I'd like to start
> > > discussing what the major plans are for 4.0. And maybe in a way
> > > that the release can be made faster than what we had for 3.0.
> > > (maybe size it in a way that we can do the dev stuff by spring
> > > 2024 and then release in summer 2024 followed by a 4.1 release to
> > > add to that instead of doing a big bang like 3.0)
> > > 
> > > Shall we share some ideas via the mailing list or start a page on
> > > our website (I think ml is easier to do). We can still document
> > > the major initiatives as soon as we have agreed in a blog post.
> > > 
> > > Here are my current thoughts (some of which might also be
> > > backported to 3.0) in no particular order
> > > 
> > > - appearance stream handlers for interactive form widgets
> > > (similar to what we have for annotations) also allowing one to
> > > add their own handler
> > > - replacement or at least new base for XMPBox (current thought is
> > > to have a new base parser and add if possible XMPBox current end
> > > user api on top - might be able to reuse xmlgraphics XMP lib).
> > > Would allow to better deal with XMPs which are not standard and
> > > make it easier to add to existing XMPs low level.
> > > - then we had the discussion about an event handler/listener
> > > similar to what fop provides so one can listen to
> > > corrections/repairs done under the hood (I know that we can only
> > > lay the ground for that as this is a major undertaking given all
> > > the places where we correct things)
> > > - enhance the parsing to keep the information about incremental
> > > versions (better debugging, trace of changes done ...)
> > > - review and add some more PDF 2.0 capabilities
> > > - better text formatting/language support (maybe by including fop
> > > parts or looking into using HarfBuzz)
> > > - I'd also like to discuss reaching out to fop to look at
> > > integrating some of their font handling into fontbox
> > > ...
> > > 
> > > That list is already long and I think would be too much given
> > > above idea of release planning.
> > > 
> > > With regards to versioning I'd like to propose that we have 2.0
> > > as LTS and 4.x being the next LTS.
> > > 
> > > Thoughts
> > > BR
> > > Maruan
> > > 
> > > 
> > > 
> > > -
> > > 
> > > To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
> > > For additional commands, e-mail: dev-h...@pdfbox.apache.org
> > > 
> > 
> > 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: dev-h...@pdfbox.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



PDFBox 4.0 and development plans

2023-10-10 Thread sahy...@fileaffairs.de
Dear colleagues,

with 3.0 being released and 4.0 being started I'd like to start
discussing what the major plans are for 4.0. And maybe in a way that
the release can be made faster than what we had for 3.0. (maybe size it
in a way that we can do the dev stuff by spring 2024 and then release
in summer 2024 followed by a 4.1 release to add to that instead of
doing a big bang like 3.0) 

Shall we share ideas via the mailing list or start a page on our
website? (I think the mailing list is easier.) We can still document the
major initiatives in a blog post once we have agreed on them.

Here are my current thoughts (some of which might also be backported to
3.0) in no particular order

- appearance stream handlers for interactive form widgets (similar to
what we have for annotations), also allowing one to add their own
handler
- replacement or at least a new base for XMPBox (current thought is to
have a new base parser and, if possible, add the current XMPBox end-user
API on top - might be able to reuse the xmlgraphics XMP lib). This would
allow us to deal better with non-standard XMP and make it easier to add
to existing XMP at a low level.
- then we had the discussion about an event handler/listener similar to
what fop provides so one can listen to corrections/repairs done under
the hood (I know that we can only lay the ground for that as this is a
major undertaking given all the places where we correct things)
- enhance the parsing to keep the information about incremental
versions (better debugging, trace of changes done ...)
- review and add some more PDF 2.0 capabilities
- better text formatting/language support (maybe by including fop parts
or looking into using HarfBuzz)
- I'd also like to discuss reaching out to fop to look at integrating
some of their font handling into fontbox
...
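
To make the event handler/listener idea from the list above more concrete, here is a minimal sketch of how a caller could be told about repairs made under the hood. Every name in it (RepairEvent, RepairListener, LenientParserSketch) is hypothetical and made up for illustration; nothing like this exists in PDFBox today, and the /Length fallback is only a stand-in for a real repair.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical event describing one repair; not a PDFBox type. */
final class RepairEvent {
    final String location;
    final String description;

    RepairEvent(String location, String description) {
        this.location = location;
        this.description = description;
    }
}

/** Hypothetical callback a user could register to hear about repairs. */
interface RepairListener {
    void repaired(RepairEvent event);
}

/** Stand-in for a lenient parser that reports its fixes instead of hiding them. */
final class LenientParserSketch {
    private final List<RepairListener> listeners = new ArrayList<>();

    void addRepairListener(RepairListener listener) {
        listeners.add(listener);
    }

    /** Parses a length value; on garbage it falls back to 0 and reports the repair. */
    int parseLength(String raw) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            fire(new RepairEvent("/Length", "invalid value '" + raw + "' replaced by 0"));
            return 0;
        }
    }

    private void fire(RepairEvent event) {
        for (RepairListener listener : listeners) {
            listener.repaired(event);
        }
    }
}
```

A caller would register a listener once, for example `parser.addRepairListener(e -> System.err.println(e.location + ": " + e.description));`, and the parsing code would call `fire(...)` wherever it corrects something.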

That list is already long and I think it would be too much given the
above idea of release planning.

With regard to versioning I'd like to propose that we have 2.0 as LTS
and 4.x being the next LTS.

Thoughts
BR
Maruan






Re: [VOTE] Release Apache PDFBox 3.0.0

2023-08-16 Thread sahy...@fileaffairs.de
+1
Maruan

On Monday, 14.08.2023 at 20:29 +0200, Andreas Lehmkühler wrote:
> Hi,
> 
> a candidate for the PDFBox 3.0.0 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/3.0.0/
> 
> The release candidate is a zip archive of the sources in:
> 
>  https://svn.apache.org/repos/asf/pdfbox/tags/3.0.0/
> 
> The SHA-512 checksum of the archive is 
> 279f283f8f97e3adb5e58546f6242b495eef26dacfc256129f790064a73934f16ceb0
> a7a9164293d506fc0fff462783d296b844611ed18e12b9de0f1724294b5.
> 
> Please vote on releasing this package as Apache PDFBox 3.0.0.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 3.0.0
>  [ ] -1 Do not release this package because...
> 
> 
> Here is my +1
> 
> Andreas
> 
> 
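
For anyone who wants to check the published SHA-512 value before voting, here is a minimal sketch using only the JDK. The class and method names are made up for illustration. Note that the checksum in the mail is wrapped over two quoted lines, so the quote markers and the trailing period have to be removed by hand before comparing; the helper only strips whitespace and ignores case.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

final class Sha512Check {
    /** Returns the lowercase hex SHA-512 digest of everything readable from the stream. */
    static String sha512Hex(InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    /** Convenience variant for a downloaded archive on disk. */
    static String sha512Hex(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            return sha512Hex(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Compares a computed digest with a published one, ignoring case and line wrapping. */
    static boolean matches(String actualHex, String published) {
        String normalized = published.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        return actualHex.equals(normalized);
    }
}
```

Calling `Sha512Check.matches(Sha512Check.sha512Hex(pathToArchive), publishedValue)` should return true for an intact download.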





Re: PDFBox 2.0.29 release?

2023-05-25 Thread sahy...@fileaffairs.de
+1

Maruan

On Wednesday, 24.05.2023 at 07:48 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> I tend to release 2.0.29 soon due to the regression which was solved
> with 
> PDFBOX-5606.
> 
> WDYT?
> 
> Andreas
> 



Re: [VOTE] Release Apache PDFBox 2.0.28

2023-04-10 Thread sahy...@fileaffairs.de
+1

Maruan

On Monday, 10.04.2023 at 12:15 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> a candidate for the PDFBox 2.0.28 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/2.0.28/
> 
> The release candidate is a zip archive of the sources in:
> 
>  https://svn.apache.org/repos/asf/pdfbox/tags/2.0.28/
> 
> The SHA-512 checksum of the archive is 
> cae8ee30903dae6ccf9821be2ec193498de5232f71fb0ad0f8ce1b53a2aa9c64cbd01
> ca7a81b6f9eef1da4aaf5146c2b54ed3ee36c5574527b751886fdbc351e.
> 
> Please vote on releasing this package as Apache PDFBox 2.0.28.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 2.0.28
>  [ ] -1 Do not release this package because...
> 
> Here is my +1
> 
> Andreas
> 



Re: Apache PDFBox Board Report April 2023 due

2023-04-10 Thread sahy...@fileaffairs.de
+1

Maruan

On Monday, 10.04.2023 at 17:30 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> find attached a quick draft of the board report we're expected to
> submit this
> month. It's based upon the report wizard template which can be found
> at [1]
> 
> Any comments or additions are appreciated ...
> 
> 
> ## Description:
> The mission of PDFBox is the creation and maintenance of software
> related to a Java library for working with PDF documents
> 
> ## Issues:
> There are no issues requiring board attention at this time.
> 
> ## Membership Data:
> Apache PDFBox was founded 2009-10-21 (13 years ago)
> There are currently 21 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
> 
> Community changes, past quarter:
> - No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
> - No new committers. Last addition was Joerg O. Henne on 2017-10-09.
> 
> ## Project Activity:
> Recent releases:
> 
>  2.0.27 was released on 2022-09-29.
>  1.8.17 was released on 2022-09-15.
>  2.0.26 was released on 2022-04-21.
> 
> ## Community Health:
> - there is a steady stream of contributions, bug reports and
> questions on the
>    mailing lists
> - there are a lot of refactorings, improvements and bugfixes
> - we are still planning to cut the first beta release of our next
> major
>    version 3.0.0
> - we've started a vote for the 2.0.28 release
> - the new release consists of bug fixes and small improvements. One of
>    the more significant changes is the improved support for Arabic PDFs
> - we received three reports through security@a.o. All of them are
> well known
>    and didn't qualify for a CVE due to a low severity
> 



Re: 2.0.28 release?

2023-03-28 Thread sahy...@fileaffairs.de
+1
Maruan

On Tuesday, 28.03.2023 at 08:46 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> how about cutting a 2.0.28 release next week on Monday?
> 
> there is a bunch of solved tickets and the last release dates back 6
> months
> 
> Andreas
> 



Re: Apache PDFBox Board Report January 2023 due

2023-01-11 Thread sahy...@fileaffairs.de
+1

Maruan

On Wednesday, 11.01.2023 at 20:09 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> find attached a quick draft of the board report we're expected to
> submit this
> month. It's based upon the report wizard template which can be found
> at [1]
> 
> Any comments or additions are appreciated ...
> 
> 
> ## Description:
> The mission of PDFBox is the creation and maintenance of software
> related to a Java library for working with PDF documents
> 
> ## Issues:
> There are no issues requiring board attention at this time.
> 
> ## Membership Data:
> Apache PDFBox was founded 2009-10-21 (13 years ago)
> There are currently 21 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
> 
> Community changes, past quarter:
> - No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
> - No new committers. Last addition was Joerg O. Henne on 2017-10-09.
> 
> ## Project Activity:
> Recent releases:
> 
>  2.0.27 was released on 2022-09-29.
>  1.8.17 was released on 2022-09-15.
>  2.0.26 was released on 2022-04-21.
> 
> ## Community Health:
> - there is a steady stream of contributions, bug reports and
> questions on the
>    mailing lists
> - due to the holiday season the last quarter was a little bit quieter
> than usual
> - we are going to cut the first beta release of our next major
>    version 3.0.0 this quarter
> - we are working on the 3.0 migration guide
> 
> 

-- 
-- 
Maruan Sahyoun

FileAffairs GmbH
Josef-Schappe-Straße 21
40882 Ratingen

Tel: +49 (2102) 89497 88
Fax: +49 (2102) 89497 91
sahy...@fileaffairs.de
www.fileaffairs.de

Geschäftsführer: Maruan Sahyoun
Handelsregister: AG Düsseldorf, HRB 53837
UST.-ID: DE248275827




Re: Apache PDFBox Board Report October 2022 due

2022-10-10 Thread sahy...@fileaffairs.de
+1
Maruan

On Sunday, 09.10.2022 at 14:06 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> find attached a quick draft of the board report we're expected to
> submit this
> month. It's based upon the report wizard template which can be found
> at [1]
> 
> Any comments or additions are appreciated ...
> 
> 
> ## Description:
> The mission of PDFBox is the creation and maintenance of software
> related to
> a Java library for working with PDF documents
> 
> ## Issues:
> There are no issues requiring board attention at this time.
> 
> ## Membership Data:
> Apache PDFBox was founded 2009-10-21 (13 years ago)
> There are currently 21 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
> 
> Community changes, past quarter:
> - No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
> - No new committers. Last addition was Joerg O. Henne on 2017-10-09.
> 
> ## Project Activity:
> Recent releases:
> 
>  2.0.27 was released on 2022-09-29.
>  1.8.17 was released on 2022-09-15.
>  2.0.26 was released on 2022-04-21.
> 
> ## Community Health:
> - there is a steady stream of contributions, bug reports and
> questions on the
>    mailing lists
> - there are a lot of refactorings, improvements and bugfixes
> - we are still planning to cut the first beta release of our next major
>    version 3.0.0
> - to do so we start to identify the last tickets with breaking changes
>    to be included in 3.0.0
> - due to the releases last month the preparations for the beta release
>    were slowed down a little
> - there was an article about "maintaining interoperability in open source
>    software". For this, the authors studied the activities within Apache
>    PDFBox for two years without the knowledge of the community. We don't
>    see any surprises, see https://s.apache.org/aljtz for further details
> 
> 
> 
> Andreas
> 
> [1] https://reporter.apache.org/wizard/?pdfbox
> 
> 



Re: [VOTE] Release Apache PDFBox 2.0.27

2022-09-26 Thread sahy...@fileaffairs.de
+1

Maruan

On Monday, 26.09.2022 at 17:28 +0200, Andreas Lehmkuehler wrote:
> a candidate for the PDFBox 2.0.27 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/2.0.27/
> 
> The release candidate is a zip archive of the sources in:
> 
>  https://svn.apache.org/repos/asf/pdfbox/tags/2.0.27/
> 
> The SHA-512 checksum of the archive is 
> 59a5675f5d1d34f092adc019679f7d10e7e93c0f554a002ac29d48cbffcaa600d9303
> 09fa94a92191c01ead8da905cbb37ce5e233dcc9b8732a881d4abf75def.
> 
> Please vote on releasing this package as Apache PDFBox 2.0.27.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 2.0.27
>  [ ] -1 Do not release this package because...
> 
> Here is my +1
> 
> Andreas
> 



Re: [VOTE] Release Apache PDFBox 1.8.17

2022-09-13 Thread sahy...@fileaffairs.de
+1

Thank you for taking care of the release process

Maruan

On Monday, 12.09.2022 at 18:50 +0200, Andreas Lehmkuehler wrote:
> a candidate for the PDFBox 1.8.17 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/1.8.17/
> 
> The release candidate is a zip archive of the sources in:
> 
>  https://svn.apache.org/repos/asf/pdfbox/tags/1.8.17/
> 
> The SHA-512 checksum of the archive is 
> e808b3b159b61b5928b0ad983b3bdadfc694ee80ca8a209669d591f90335165a45de6
> 84ea04b23d0a149bfc7ce5d890a287cb4e79300f3a08bb954884024c909.
> 
> Please vote on releasing this package as Apache PDFBox 1.8.17.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 1.8.17
>  [ ] -1 Do not release this package because...
> 
> 
> Here is my +1
> 
> Andreas
> 



Re: Replace methods using an InputStream from Loader.loadPDF

2022-07-31 Thread sahy...@fileaffairs.de
Hi,

I'm very much in favour of simplifying as much as possible and not doing
too much magic under the hood which can be better handled individually
by a developer. This will also leave room for an individual to come up
with an optimized version for specific use cases.

+1 from my side.

BR
Maruan


On Sunday, 31.07.2022 at 15:18 +0200, Andreas Lehmkuehler wrote:
> Hi fellow devs,
> 
> 
> there was a discussion on JIRA [1] about the changed behaviour of the
> parser due 
> to the removal of the ScratchFileBuffer when reading a pdf.
> 
> Additionally there was the post "High memory usage with pdfbox 3" on 
> users@pdfbox targeting the very same topic
> 
> After explaining myself and my changes twice I came to the conclusion
> that I'm going to have to do so again and again in the future if we
> don't change the API of Loader.loadPDF
> 
> People simply realize that all methods to be used for loading a pdf
> are moved 
> from PDDocument to Loader. They expect the very same behaviour when
> using a 
> similar api and that is understandable from a user point of view.
> 
> We have to remove the loadPDF variants using InputStream and replace
> them with RandomAccessRead.
> 
> When it comes to InputStreams, users have to decide how to proceed:
> * copy the InputStream to memory by using RandomAccessReadBuffer
> * copy the InputStream to a file and use RandomAccessReadBufferedFile
>   or RandomAccessReadMemoryMappedFile
> 
> This would make it more transparent what happens under the hood when
> using the 
> different kinds of loadPDF methods:
> 
> * a byte array as source is already in memory and the obvious choice
>   is to use RandomAccessReadBuffer as a wrapper
> * a file as source targets a local file and the most obvious choice
>   is to use RandomAccessReadBufferedFile as a wrapper. We should document
>   that RandomAccessReadMemoryMappedFile is offered as an alternative in
>   this case
> * RandomAccessRead as source is the most obvious one and the user
>   decides how to create it. Additionally, it is possible to implement a
>   custom caching and/or loading mechanism
> 
> I know, this will lead to some changes in the codebase of our users,
> but they 
> have to do it in any case as the method was moved, so why not change
> the data 
> type as well
> 
> 
> WDYT? Am I missing something?
> 
> Andreas
> 
> [1] https://issues.apache.org/jira/browse/PDFBOX-5462
> 
> 
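
The two choices for an InputStream described in the quoted mail can be sketched with plain JDK code. This is only an illustration of the decision the caller has to make, not PDFBox code: the class and method names are made up, and wrapping the results in RandomAccessReadBuffer or RandomAccessReadBufferedFile (the classes named in the mail) is left out so the sketch runs without PDFBox on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class InputStreamSources {
    /** Option 1: buffer the whole stream in memory; heap use grows with the PDF size. */
    static byte[] copyToMemory(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Option 2: spool the stream to a temporary file; the caller deletes it when done. */
    static Path copyToTempFile(InputStream in) {
        try {
            Path tmp = Files.createTempFile("pdf-source-", ".pdf");
            // createTempFile already created the file, so it has to be replaced
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The byte array from option 1 corresponds to the "already in memory" case in the mail, the path from option 2 to the "local file" case; which one fits depends on document size and available heap, and that is exactly the decision the proposal hands back to the user.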




Re: Apache PDFBox Board Report July 2022 due

2022-07-12 Thread sahy...@fileaffairs.de


+1
Maruan

On Monday, 11.07.2022 at 22:30 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> find attached a quick draft of the board report we're expected to
> submit this
> month. It's based upon the report wizard template which can be found
> at [1]
> 
> Any comments or additions are appreciated ...
> 
> P.S.: I'm going to add some private comment about the issue we've
> discussed on 
> private@ recently before posting the report
> 
> 
> ## Description:
> The mission of PDFBox is the creation and maintenance of software
> related to a Java library for working with PDF documents
> 
> ## Issues:
> There are no issues requiring board attention at this time.
> 
> ## Membership Data:
> Apache PDFBox was founded 2009-10-21 (13 years ago)
> There are currently 21 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
> 
> Community changes, past quarter:
> - No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
> - No new committers. Last addition was Joerg O. Henne on 2017-10-09.
> 
> ## Project Activity:
> Recent releases:
> 
>  3.0.0-alpha3 was released on 2022-05-05.
>  2.0.26 was released on 2022-04-21.
>  3.0.4 JBIG2 was released on 2022-03-01.
>  2.0.25 was released on 2021-12-16.
> 
> ## Community Health:
> - there is a steady stream of contributions, bug reports and
> questions on the
>    mailing lists
> - there are a lot of refactorings, improvements and bugfixes
> - we are planning to cut the first beta release of our next major
> version
>    3.0.0
> - to do so we start to identify the last tickets with breaking
> changes to be
>    included in 3.0.0
> 
> 
> Andreas
> 
> [1] https://reporter.apache.org/wizard/?pdfbox
> 



Re: New PDFBox 3.0.0 release

2022-06-14 Thread sahy...@fileaffairs.de



On Tuesday, 14.06.2022 at 08:19 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> looks like it is time for another 3.0.0 release of PDFBox. Depending
> on the 
> outcome of the next regression test I'd like to cut the next 3.0.0
> release.
> 
> Should we target another alpha or maybe the first beta?
> 
> Or is it time for a stable 3.0.0 PDFBox release already?

I'd go for a stable release, depending on the results of the regression
tests.

BR
Maruan

> 
> WDYT?
> 
> Do you have some TODOs on your lists which have to be solved first?
> 
> I'm going to resolve my remaining 3.0.0 tickets soon
> 
> Andreas
> 



Re: [VOTE] Release Apache PDFBox 3.0.0-alpha3

2022-05-02 Thread sahy...@fileaffairs.de
+1
Maruan

On Monday, 02.05.2022 at 19:27 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> a candidate for the PDFBox 3.0.0-alpha3 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/3.0.0-alpha3/
> 
> The release candidate is a zip archive of the sources in:
> 
>  http://svn.apache.org/repos/asf/pdfbox/tags/3.0.0-alpha3/
> 
> The SHA-512 checksum of the archive is 
> 1cc2f84335745e0282cda192418d62aff0e85f3f1db8567f4484d086d02f1609beba1
> afaa348d167c07025b6fad2426a58d53e66b7c5e68e19029c1510c2966f.
> 
> Please vote on releasing this package as Apache PDFBox 3.0.0-alpha3.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 3.0.0-alpha3
>  [ ] -1 Do not release this package because...
> 
> Here is my +1
> 
> Andreas
> 



Re: [VOTE] Release Apache PDFBox 2.0.26

2022-04-18 Thread sahy...@fileaffairs.de
+1 
Maruan

On Monday, 18.04.2022 at 13:14 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> a candidate for the PDFBox 2.0.26 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/2.0.26/
> 
> The release candidate is a zip archive of the sources in:
> 
>  http://svn.apache.org/repos/asf/pdfbox/tags/2.0.26/
> 
> The SHA-512 checksum of the archive is 
> e14c57e28d10324dbcb6ad239bad5751a2dab0035bbd80427afd03f65467ec1376ddd
> 7d08e7cefd4d950b149f85d8f505f6f50cc3093fd65bb8a2cbb2b8c7c1e.
> 
> Please vote on releasing this package as Apache PDFBox 2.0.26.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 2.0.26
>  [ ] -1 Do not release this package because...
> 
> Here is my +1
> 
> Andreas
> 



Re: Apache PDFBox Board Report April 2022 due

2022-04-10 Thread sahy...@fileaffairs.de
+1

Maruan

On Sunday, 10.04.2022 at 12:16 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> find attached a quick draft of the board report we're expected to
> submit this
> month. It's based upon the report wizard template which can be found
> at [1]
> 
> Any comments or additions are appreciated ...
> 
> 
> ## Description:
> The mission of PDFBox is the creation and maintenance of software
> related to a Java library for working with PDF documents
> 
> ## Issues:
> There are no issues requiring board attention at this time.
> 
> ## Membership Data:
> Apache PDFBox was founded 2009-10-21 (12 years ago)
> There are currently 21 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
> 
> Community changes, past quarter:
> - No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
> - No new committers. Last addition was Joerg O. Henne on 2017-10-09.
> 
> ## Project Activity:
> Recent releases:
> 
>  3.0.4 JBIG2 was released on 2022-03-01.
>  2.0.25 was released on 2021-12-16.
>  3.0.0-alpha2 was released on 2021-09-10.
> 
> ## Community Health:
> - there is a steady stream of contributions, bug reports and
> questions on the
>    mailing lists
> - there are a lot of refactorings, improvements and bugfixes
> - the release process for 2.0.26 just started
> - we are planning to cut another alpha release of our next major
> version 3.0.0
> 
> 
> 
> Andreas
> 
> [1] https://reporter.apache.org/wizard/?pdfbox
> 



Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-10 Thread sahy...@fileaffairs.de
On Friday, 11.03.2022 at 08:19 +0100, Andreas Lehmkuehler wrote:
> On 10.03.22 at 20:16, Tilman Hausherr wrote:
> > I'd agree but that might mean PDFBOX-5384 wouldn't be fixed.
> It has been there for quite some time and seems to be a rare corner
> case. IMHO it can wait if we don't find a solution before Monday.
> 
> WDYT?

Don't think that this should block the release.
BR
Maruan

> 
> Andreas
> 
> > 
> > Tilman
> > 
> > On 10.03.2022 at 19:05, Andreas Lehmkuehler wrote:
> > > On 09.03.22 at 17:07, Tim Allison wrote:
> > > > All,
> > > > 
> > > > I've been out of the office for a bit and haven't caught up
> > > > yet.
> > > > Apologies if I've missed the discussion.
> > > > 
> > > > Are there plans for a 2.0.26 release?  We're probably a few
> > > > weeks out
> > > How about cutting the release next Monday?
> > > 
> > > Andreas
> > > 
> > > > from starting our next 1.x and 2.x releases on Tika, and it
> > > > would be
> > > > great to incorporate 2.0.26.  No problem at all if 2.0.26 is
> > > > slated
> > > > for later.
> > > > 
> > > > Thank you!
> > > > 
> > > > Cheers,
> > > > 
> > > >  Tim
> > > > 
> > > > On Fri, Mar 4, 2022 at 10:46 PM Tilman Hausherr
> > > >  wrote:
> > > > > 
> > > > > On 24.02.2022 at 07:41, Andreas Lehmkuehler wrote:
> > > > > > On 22.02.22 at 07:49, Andreas Lehmkuehler wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I'm planning to cut a new JBIG2 release next week. There
> > > > > > > aren't that
> > > > > > > much changes but I think the fixes are worth to be
> > > > > > > released. [1]
> > > > > > I'm going to cut the release next weekend, if nobody
> > > > > > objects.
> > > > > > 
> > > > > > Once it is done we should think about a 2.0.26 release of
> > > > > > PDFBox
> > > > > 
> > > > > 
> > > > > Yes please!
> > > > > 
> > > > > Tilman
> > > > > 
> > > > > 



Re: COSBase, avoid to have the same hashCode for different objects holding the same value

2022-03-07 Thread sahy...@fileaffairs.de



On Saturday, 05.03.2022 at 16:30 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> I'm not sure if we discussed that topic in the past or if I simply
> mixed it up with a discussion about "equals" and "="

Not sure that we discussed that in the context of primitives, but having
worked on the primitives a bit myself, I know there were already
hashCode/equals methods. In addition, as you pointed out, static
instances have also been around for quite a while.

I also had to revert changes done around equals/hashCode in the past as
the current implementation of the COS type classes and (mainly)
COSWriter are dependent on keeping a specific handling.

So to me the question really is what one would expect from a PDF
perspective. If that means that we need to treat equals as being
identical then I'm fine going down that route. This would also resolve
the mutability question we had a while ago. E.g. what happens if the
COSInteger value of COSObject 100 changes but COSObject 200 points to
the same (Java) object?
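
The mutability question can be made concrete with a small sketch. MutableNumber is a hypothetical stand-in invented for this example and is not how COSInteger is implemented; the object numbers 100 and 200 are simply taken from the question above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mutable number; a stand-in for illustration only. */
final class MutableNumber {
    long value;

    MutableNumber(long value) {
        this.value = value;
    }
}

final class SharedInstanceDemo {
    /** Object table in which entries 100 and 200 point at ONE Java instance. */
    static Map<Integer, MutableNumber> sharedTable() {
        MutableNumber shared = new MutableNumber(512);
        Map<Integer, MutableNumber> table = new HashMap<>();
        table.put(100, shared);
        table.put(200, shared);
        return table;
    }

    /** Object table in which entries 100 and 200 each own their instance. */
    static Map<Integer, MutableNumber> separateTable() {
        Map<Integer, MutableNumber> table = new HashMap<>();
        table.put(100, new MutableNumber(512));
        table.put(200, new MutableNumber(512));
        return table;
    }
}
```

With the shared table, `table.get(100).value = 1024` silently changes what object 200 sees as well; with separate instances it does not.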

Currently we have a mix which is inconsistent and the source of
surprises.
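
To make that kind of surprise concrete, here is a minimal sketch. It
assumes a hypothetical value-hashed class (ValueInt) standing in for a
primitive like COSInteger; it is not the PDFBox class and does not claim
to mirror its implementation. It only shows how two distinct instances
holding 512 collapse into one entry in a value-keyed map but stay
separate in an identity-keyed one.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a value-hashed primitive; NOT the PDFBox COSInteger.
final class ValueInt {
    private final long value;

    ValueInt(long value) {
        this.value = value;
    }

    // Equality and hash are derived from the held value only.
    @Override
    public boolean equals(Object o) {
        return o instanceof ValueInt && ((ValueInt) o).value == value;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(value);
    }
}

final class HashSurpriseDemo {
    // Number of entries left when both instances are used as keys of a HashMap.
    static int entriesByValue(Object a, Object b) {
        Map<Object, String> map = new HashMap<>();
        map.put(a, "100 0");
        map.put(b, "200 0"); // overwrites the first entry if a.equals(b)
        return map.size();
    }

    // Same experiment with a map that compares keys by reference (==).
    static int entriesByIdentity(Object a, Object b) {
        Map<Object, String> map = new IdentityHashMap<>();
        map.put(a, "100 0");
        map.put(b, "200 0");
        return map.size();
    }

    public static void main(String[] args) {
        ValueInt length100 = new ValueInt(512);
        ValueInt length200 = new ValueInt(512);
        System.out.println(entriesByValue(length100, length200));    // 1
        System.out.println(entriesByIdentity(length100, length200)); // 2
    }
}
```

Which of the two results is "right" is exactly the open question: it
depends on whether equal values are meant to be the same object from a
PDF perspective.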

In the long run that also means that we need a more intelligent
COSWriter, as when writing a PDF we should be able to benefit from
storing the "same" content only once and referencing it.
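
A rough sketch of what "store once and reference" could look like on the
writing side. Everything here (DedupPool, referenceFor, using the
serialized form as the key) is a hypothetical illustration and not
COSWriter API; it also assumes the pooled content is never changed after
it has been handed out, which is the mutability question from above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical writer-side pool; names are illustrative, not PDFBox API.
final class DedupPool {
    // Maps serialized content to the object number it was first stored under.
    private final Map<String, Integer> numberByContent = new LinkedHashMap<>();
    private int nextNumber = 1;

    // Returns the object number for this content, allocating a new number
    // only the first time the content is seen.
    int referenceFor(String serializedContent) {
        Integer existing = numberByContent.get(serializedContent);
        if (existing != null) {
            return existing;
        }
        int number = nextNumber++;
        numberByContent.put(serializedContent, number);
        return number;
    }

    // How many distinct objects would actually be written.
    int storedObjects() {
        return numberByContent.size();
    }

    public static void main(String[] args) {
        DedupPool pool = new DedupPool();
        int first = pool.referenceFor("512");
        int second = pool.referenceFor("512");
        int other = pool.referenceFor("1024");
        System.out.println(first == second);      // true
        System.out.println(first == other);       // false
        System.out.println(pool.storedObjects()); // 2
    }
}
```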

BR
Maruan

> 
> However, PDFBOX-5286 shows that we have an issue with objects which
> aren't the same but are treated as the same because of the same hash.
> This is true for all simple objects such as COSInteger, COSFloat,
> COSBoolean and COSName.
> 
> Think about the following two indirect /Length objects
> 
> 100 0 obj
> 512
> endobj
> 
> 
> 200 0 obj
> 512
> endobj
> 
> * there are two different COSObjects "100 0" and "200 0"
> * both COSObjects have different hashes
> * both COSObjects are referencing a COSInteger holding the same value
> "512"
> * both COSIntegers are different objects
> * both COSIntegers have the SAME hash, as the current implementation
> of hashCode 
> is based on the value of the COSInteger
> 
> Or some pseudo code
> 
> COSObject(100,0) != COSObject(200,0)
> COSInteger(100,0) != COSInteger(200,0)
> COSObject(100,0).hashCode != COSObject(200,0).hashCode
> COSInteger(100,0).hashCode == COSInteger(200,0).hashCode
> COSInteger(100,0).equals(COSInteger(200,0)) == true
> 
> IMHO we should change the implementation of hashCode so that
> different objects 
> will have different hashCodes.
> 
> I expect some side effects
> * we are using a lot of hash-based collections and I'm afraid there
> may be some 
> cases where the fact of having the same hash for different objects is
> wanted 
> (knowingly or not)
> * we have to remove the static instances for COSInteger values in a
> range from 
> -100 to 256 which will result in an increased number of COSInteger
> instances
> * there are just two static instances of COSBoolean ("true" and
> "false") which 
> have to be replaced too
> * COSName is caching a lot of values as static instances as well,
> which should 
> be removed as well
> * looks like COSFloat shouldn't be a problem
> 
> WDYT? Should we simply start with COSFloat and COSInteger and see how
> it ends up?
> 
> Andreas




Re: [VOTE] Release Apache PDFBox 2.0.25

2021-12-14 Thread sahy...@fileaffairs.de
+1

Maruan
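
For anyone checking a candidate before voting, here is a minimal sketch
of comparing a downloaded archive against the SHA-512 checksum given in
the vote mail quoted below. The class and method names are made up for
illustration; only the JDK's MessageDigest is used. Since the checksum
is wrapped across two lines in the mail, whitespace is stripped before
comparing.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative helper, not part of PDFBox or the release tooling.
final class Sha512Check {
    // Lower-case hex SHA-512 digest of the given bytes.
    static String sha512Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Compares ignoring case and any whitespace or line wraps in the announced value.
    static boolean matches(String computedHex, String announced) {
        return computedHex.equalsIgnoreCase(announced.replaceAll("\\s+", ""));
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the downloaded archive, args[1]: checksum from the vote mail
        byte[] archive = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(matches(sha512Hex(archive), args[1]) ? "checksum OK" : "checksum MISMATCH");
    }
}
```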

On Monday, 2021-12-13 at 20:02 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> a candidate for the PDFBox 2.0.25 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/2.0.25/
> 
> The release candidate is a zip archive of the sources in:
> 
>  http://svn.apache.org/repos/asf/pdfbox/tags/2.0.25/
> 
> The SHA-512 checksum of the archive is 
> e143b2a9aaa4b1f1be72e16a1c9968dacfcb3e89b4f21fdbd0580d8c9f1c9b54ee38d
> 05fe3e52ff93493c858c51090fdd8256d22153cffba1e9b523fdbd1f2f4.
> 
> Please vote on releasing this package as Apache PDFBox 2.0.25.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 2.0.25
>  [ ] -1 Do not release this package because...
> 
> Here is my +1
> 
> Andreas
> 




Re: Code Scanning / Security Analysis on github repo

2021-12-10 Thread sahy...@fileaffairs.de
Shall we enable it (I would need to check whether I have the rights to
do so)? Do we need to cast a vote for that? ...

BR
Maruan

On Wednesday, 2021-12-08 at 12:12 +0100, Tilman Hausherr wrote:
> Yes this would be interesting. Although I expect a lot of false 
> positives, like with Sonar.
> 
> Tilman
> 
> On 2021-12-08 at 10:14, sahy...@fileaffairs.de wrote:
> > Dear all,
> > 
> > what about enabling code scanning / security analysis on our github
> > repo (if possible).
> > 
> > https://docs.github.com/en/code-security
> > 
> > Thoughts?
> > 
> > Maruan
> > 
> > 





Code Scanning / Security Analysis on github repo

2021-12-08 Thread sahy...@fileaffairs.de
Dear all,

what about enabling code scanning / security analysis on our github
repo (if possible).

https://docs.github.com/en/code-security

Thoughts?

Maruan





Re: 2.0.25 Release?

2021-12-08 Thread sahy...@fileaffairs.de
+1

Maruan


On Wednesday, 2021-12-08 at 08:18 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> how about cutting a 2.0.25 release next week (Monday or Tuesday)?
> 
> It would be nice to have another one this year.
> 
> Andreas
> 




[DISCUSS] Move trunk (3.0) to Java 11

2021-11-22 Thread sahy...@fileaffairs.de
Hi,

what about setting Java 11 as the minimum requirement for trunk, i.e.
the upcoming 3.0 release? Not that we have to use classes/constructs...
from Java 11, but as Java 8 is outdated, declaring Java 11 as the
minimum now has the benefit that we won't need a major release later if
we'd like to switch to Java 11 during the 3.0 lifecycle.

Thoughts?

BR 
Maruan




Re: [DISCUSS] Performance tests for PDFBox

2021-10-18 Thread sahy...@fileaffairs.de
On Monday, 2021-10-18 at 07:26 +0200, Andreas Lehmkuehler wrote:
> On 2021-10-17 at 20:39, sahy...@fileaffairs.de wrote:
> > On Sunday, 2021-10-17 at 12:45 +0200, Tilman Hausherr wrote:
> > > +1
> > > 
> > > Yes this should be done, although I don't really know how. Maybe
> > > it's
> > > just windows, or maybe it's just me, but I found it difficult to
> > > get
> > > reliable, reproducible benchmarks.
> > 
> > IMHO they are reproducible within the same environment and under a
> > controlled workload of the system they are running on. And they only
> > provide a baseline for further analysis. OTOH they should help us
> > find larger improvements as well as degradations.
> > 
> > If we use the same set of test files at least they will also help
> > us
> > looking at performance (and memory consumption btw) from a common
> > ground.
> I agree with Maruan. Such tests don't qualify as deterministic cases
> for the 
> build process but they may give some valuable results when looking
> for 
> performance/resource issues.
> 
> > What about starting with a rendering and/or text extraction test
> > and
> > take it from there. As noted I'd see that in an extra package
> > similar
> > to examples so we can run it on a case by case basis.
> We should add the save test compressed/uncompressed as well.

I already have that as part of PDFBOX-5286 locally.

If you're fine with it I'll create a new performance subproject and add
the stuff I have. Will be handled by a new ticket.

WDYT?

BR
Maruan


> 
> Andreas
> > 
> > BR
> > Maruan
> > 
> > 
> > > 
> > > Tilman
> > > 
> > > On 2021-10-14 at 21:21, sahy...@fileaffairs.de wrote:
> > > > Hi,
> > > > 
> > > > given that there is PDFBOX-5286, first noted in PDFBOX-5068,
> > > > and we
> > > > also see variations in performance between releases creating a
> > > > testbed
> > > > for performance testing came to my mind. I did some very basic
> > > > tests
> > > > using JMH, some of which are noted in the above tickets.
> > > > 
> > > > What about formalizing that? Similar to testing done by the
> > > > Tika
> > > > colleagues when it comes to text extraction.
> > > > 
> > > > Cases I see are around parsing, saving, rendering and text
> > > > extraction
> > > > and some basic workflows such as filling a form field.
> > > > 
> > > > As runtime will differ between different environments it might
> > > > be
> > > > worth
> > > > creating an extra subproject for that and run that as needed.
> > > > We
> > > > can
> > > > take the numbers from the first run and create a baseline file
> > > > from
> > > > that if we'd like to have some kind of automated comparison...
> > > > 
> > > > Having some common test will help us finding regressions
> > > > earlier
> > > > and
> > > > also help testing enhancements against a defined  set of files.
> > > > This
> > > > would complement the functionality based tests we have and also
> > > > the
> > > > larger test runs done for text extraction and rendering.
> > > > 
> > > > WDYT?
> > > > 
> > > > Maruan Sahyoun
> > > > 
> > > > 




Re: PDF form checkbox

2021-10-17 Thread sahy...@fileaffairs.de
Dear Craig,

try this one
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf

BR
Maruan
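
As a side note to the state-name question in the quoted thread below:
"Off" is the name reserved for the unchecked state, while the on-state
name may be "Yes" or a freely chosen export value. A small sketch of how
a consumer might interpret a field value under that rule follows. The
helper names are hypothetical, and the lenient variant, which also
treats the "No"/"false" spellings Craig observed as unchecked, is purely
an assumption about non-conforming producers, not something the
specification defines.

```java
import java.util.Locale;

// Hypothetical helper for interpreting check box state names; not PDFBox API.
final class CheckBoxState {
    // Name of the unchecked appearance state.
    static final String OFF = "Off";

    // Strict reading: anything other than "Off" is an on state,
    // because the on-state name can be a freely chosen export value.
    static boolean isChecked(String stateName) {
        return stateName != null && !OFF.equals(stateName);
    }

    // Lenient reading (assumption): also accept common non-standard
    // spellings of the unchecked state written by some form fillers.
    static boolean isCheckedLenient(String stateName) {
        if (stateName == null) {
            return false;
        }
        switch (stateName.toLowerCase(Locale.ROOT)) {
            case "off":
            case "no":
            case "false":
                return false;
            default:
                return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isChecked("Yes"));       // true
        System.out.println(isChecked("Off"));       // false
        System.out.println(isChecked("No"));        // true (strictly, "No" is just another on-state name)
        System.out.println(isCheckedLenient("No")); // false
    }
}
```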


On Sunday, 2021-10-17 at 11:43 -0700, Craig Russell wrote:
> Hi Tilman,
> 
> I've been unable to find the details on the 12.7.4.2.3 Check Boxes in
> the 1.7 PDF specification because of a paywall.
> 
> Is the specification for 12.7.4.2.3 available somewhere else?
> 
> Thanks,
> Craig
> 
> > On Oct 15, 2021, at 11:41 AM, Tilman Hausherr 
> > wrote:
> > 
> > On 2021-10-15 at 20:22, Craig Russell wrote:
> > > Hi,
> > > 
> > > I created a PDF form with a checkbox and I'm analyzing the
> > > resulting PDF.
> > > 
> > > It appears that some form fill programs resolve the checkbox with a
> > > "true/false" value while others use "Yes/No" and others "On/Off".
> > 
> > The state names should be Yes/Off (see 12.7.4.2.3 Check Boxes in the
> > 1.7 PDF specification
> > http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/adobe_supplement_iso32000.pdf
> >  ) but you can also use export values that can be chosen freely.
> > 
> > Tilman
> > 
> > 
> > 
> > > 
> > > Is there any standard for how the checkbox is supposed to be
> > > represented in the filled form?
> > > 
> > > I could not find anything about this on world wide google... ;-)
> > > 
> > > Thanks,
> > > Craig
> > > 
> > > Craig L Russell
> > > c...@apache.org
> > > 
> > > 
> 
> Craig L Russell
> c...@apache.org
> 
> 




Re: [DISCUSS] Performance tests for PDFBox

2021-10-17 Thread sahy...@fileaffairs.de
On Sunday, 2021-10-17 at 12:45 +0200, Tilman Hausherr wrote:
> +1
> 
> Yes this should be done, although I don't really know how. Maybe it's
> just windows, or maybe it's just me, but I found it difficult to get 
> reliable, reproducible benchmarks.

IMHO they are reproducible within the same environment and under a
controlled workload of the system they are running on. And they only
provide a baseline for further analysis. OTOH they should help us find
larger improvements as well as degradations.

If we use the same set of test files at least they will also help us
looking at performance (and memory consumption btw) from a common
ground.

What about starting with a rendering and/or text extraction test and
take it from there. As noted I'd see that in an extra package similar
to examples so we can run it on a case by case basis.

BR
Maruan 


> 
> Tilman
> 
> On 2021-10-14 at 21:21, sahy...@fileaffairs.de wrote:
> > Hi,
> > 
> > given that there is PDFBOX-5286, first noted in PDFBOX-5068, and we
> > also see variations in performance between releases creating a
> > testbed
> > for performance testing came to my mind. I did some very basic
> > tests
> > using JMH, some of which are noted in the above tickets.
> > 
> > What about formalizing that? Similar to testing done by the Tika
> > colleagues when it comes to text extraction.
> > 
> > Cases I see are around parsing, saving, rendering and text
> > extraction
> > and some basic workflows such as filling a form field.
> > 
> > As runtime will differ between different environments it might be
> > worth
> > creating an extra subproject for that and run that as needed. We
> > can
> > take the numbers from the first run and create a baseline file from
> > that if we'd like to have some kind of automated comparison...
> > 
> > Having some common test will help us finding regressions earlier
> > and
> > also help testing enhancements against a defined  set of files.
> > This
> > would complement the functionality based tests we have and also the
> > larger test runs done for text extraction and rendering.
> > 
> > WDYT?
> > 
> > Maruan Sahyoun
> > 
> > 




[DISCUSS] Performance tests for PDFBox

2021-10-14 Thread sahy...@fileaffairs.de
Hi,

given that there is PDFBOX-5286, first noted in PDFBOX-5068, and that we
also see variations in performance between releases, creating a testbed
for performance testing came to my mind. I did some very basic tests
using JMH, some of which are noted in the above tickets.

What about formalizing that? Similar to testing done by the Tika
colleagues when it comes to text extraction.

Cases I see are around parsing, saving, rendering and text extraction
and some basic workflows such as filling a form field.   

As runtime will differ between different environments it might be worth
creating an extra subproject for that and run that as needed. We can
take the numbers from the first run and create a baseline file from
that if we'd like to have some kind of automated comparison...
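
A minimal sketch of the baseline idea, using only the JDK so it stays
self-contained. A real testbed would more likely use JMH as mentioned
above. The names (BaselineCheck, medianNanos, isRegression) and the
tolerance value are assumptions for illustration, and the Runnable
stands in for a real workload such as rendering or text extraction.

```java
import java.util.Arrays;

// Illustrative only; a plain-JDK stand-in for a proper JMH benchmark.
final class BaselineCheck {
    // Median wall-clock time in nanoseconds over `runs` measured executions,
    // after `warmup` unmeasured executions. Requires runs >= 1.
    static long medianNanos(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    // True when the measurement exceeds the baseline by more than the
    // given fraction (e.g. 0.2 means more than 20 % slower).
    static boolean isRegression(long baselineNanos, long measuredNanos, double tolerance) {
        return measuredNanos > baselineNanos * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the operation under test.
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i);
            }
        };
        long baseline = medianNanos(workload, 5, 11); // would be read from the baseline file
        long current = medianNanos(workload, 5, 11);
        System.out.println("regression: " + isRegression(baseline, current, 0.2));
    }
}
```

The median is used instead of the mean so that a single slow run (GC
pause, other system load) does not dominate the comparison. The baseline
is only meaningful on the machine it was recorded on.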

Having some common tests will help us find regressions earlier and
also help testing enhancements against a defined set of files. This
would complement the functionality-based tests we have and also the
larger test runs done for text extraction and rendering.

WDYT?

Maruan Sahyoun





Re: Apache PDFBox Board Report October 2021 due

2021-10-06 Thread sahy...@fileaffairs.de
+1
Maruan

On Tuesday, 2021-10-05 at 22:43 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> find attached a quick draft of the board report we're expected to
> submit this
> month. It's based upon the report wizard template which can be found
> at [1]
> 
> Any comments or additions are appreciated ...
> 
> 
> 
> ## Description:
> The mission of PDFBox is the creation and maintenance of software
> related to
> Java library for working with PDF documents
> 
> ## Issues:
> There are no issues requiring board attention at this time.
> 
> ## Membership Data:
> Apache PDFBox was founded 2009-10-21 (12 years ago)
> There are currently 21 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
> 
> Community changes, past quarter:
> - No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
> - No new committers. Last addition was Joerg O. Henne on 2017-10-09.
> 
> ## Project Activity:
> Recent releases:
> 
>  2.0.24 was released on 2021-06-10.
>  3.0.0-RC1 was released on 2021-04-01.
>  2.0.23 was released on 2021-03-18.
> 
> ## Community Health:
> - there is a steady stream of contributions, bug reports and
> questions on the
>    mailing lists
> - there are a lot of refactorings, improvements and bugfixes
> - we are working on finalizing 3.0.0
> - we found a massive performance regression in 3.0.0 when working on
> the
>    optimization of incremental save
> 
> 
> 
> Andreas
> 
> [1] https://reporter.apache.org/wizard/?pdfbox
> 
> 




Re: [VOTE] Release Apache PDFBox 3.0.0-alpha2

2021-09-08 Thread sahy...@fileaffairs.de
+1

Maruan

On Tuesday, 2021-09-07 at 19:20 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> a candidate for the PDFBox 3.0.0-alpha2 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/3.0.0-alpha2/
> 
> The release candidate is a zip archive of the sources in:
> 
>  http://svn.apache.org/repos/asf/pdfbox/tags/3.0.0-alpha2/
> 
> The SHA-512 checksum of the archive is 
> 62e2d38066783dec4e0286842209b1ad64abe7086c97c33be40fe322e9baeae56e2d1
> 493ff39ce881962c0accbb5d234b02b570df6cc56dd165b0c749ff46bf6.
> 
> Please vote on releasing this package as Apache PDFBox 3.0.0-alpha2.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 3.0.0-alpha2
>  [ ] -1 Do not release this package because...
> 
> Here is my +1
> 
> Andreas
> 




Re: Fwd: [NOTICE] Upcoming changes to project web sites hosted via Git

2021-08-31 Thread sahy...@fileaffairs.de
On Tuesday, 2021-08-31 at 21:13 +0200, sahy...@fileaffairs.de wrote:
> 
> 
> On Tuesday, 2021-08-31 at 12:50 +0200, Andreas Lehmkuehler wrote:
> > Hi,
> > 
> > I've adjusted our website building process due to the announced
> > changes, see below.
> > 
> > It works but it isn't perfect. IMHO we should remove the content
> > directory from 
> > the asf-site repo so that we can hold the newly introduced .asf.yaml
> > file within 
> > the master repo.
> 
> Hi Andreas,
> 
> I don't understand why this should be done. Isn't .asf.yaml meant to be
> sitting in different branches and as we are using asf-site as the
> source for publishing there should be an .asf.yaml in there?
> 
> What did I miss?

From what I understand, we can have an .asf.yaml on the master branch
and on the asf-site branch.

E.g.
https://github.com/apache/camel-website/tree/main 
https://github.com/apache/camel-website/tree/asf-site

BR
Maruan

> 
> BR
> Maruan
> 
> > 
> > 
> > Andreas
> > 
> > 
> > -------- Forwarded Message --------
> > Subject: [NOTICE] Upcoming changes to project web sites hosted via
> > Git
> > Date: Mon, 5 Apr 2021 20:06:15 +0200
> > From: Daniel Gruno 
> > To: Users 
> > 
> > TL;DR: If your project uses subversion or .asf.yaml for publishing
> > your project 
> > web site, you're fine. If not, please read this and act before June 1st:
> > 
> > 
> > Hi folks,
> > In order to simplify workflows and the amount of services we have to
> > maintain, 
> > we've decided to phase out gitwcsub in favor of stageD. What this
> > means in plain 
> > English is that all project web sites hosted from a git repository
> > MUST have 
> > publishing settings in a .asf.yaml file for them to update web sites
> > in the future.
> > 
> > This change will take effect around June 1st, giving projects two
> > months to 
> > complete the transition. In about four weeks, we'll be sending emails
> > directly 
> > to every project that is still using the old gitwcsub methods for
> > their web 
> > sites, encouraging them to switch before we decommission the service.
> > 
> > Once the gitwcsub service has been shut off, projects that haven't
> > switched 
> > won't be able to update their web sites until they perform the
> > switch.
> > 
> > For projects wishing to proactively switch, the change is minimal.
> > All you need 
> > to do is have a file called .asf.yaml at the base of your web site
> > repository 
> > (or the branch in your git repo that hosts your web site), with the
> > following 
> > snippet inside:
> > 
> > publish:
> >    whoami: asf-site
> > 
> > 
> > This is, of course, assuming your published branch is the standard
> > asf-site 
> > branch. It could be master or something else.
> > 
> > Projects wishing to check where we are currently collecting their web
> > sites 
> > from, can go to https://infra-reports.apache.org/site-source/ and see
> > all our 
> > sources for the web sites. The column on your right will tell if you
> > are using 
> > .asf.yaml for your web site or not.
> > 
> > For more information on .asf.yaml and its features (such as staging
> > web sites 
> > etc), please see https://s.apache.org/asfyaml
> > 
> > With regards,
> > Daniel on behalf of ASF Infra.
> > 




Re: Fwd: [NOTICE] Upcoming changes to project web sites hosted via Git

2021-08-31 Thread sahy...@fileaffairs.de



On Tuesday, 2021-08-31 at 12:50 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> I've adjusted our website building process due to the announced
> changes, see below.
> 
> It works but it isn't perfect. IMHO we should remove the content
> directory from 
> the asf-site repo so that we can hold the newly introduced .asf.yaml
> file within 
> the master repo.

Hi Andreas,

I don't understand why this should be done. Isn't .asf.yaml meant to be
sitting in different branches and as we are using asf-site as the
source for publishing there should be an .asf.yaml in there?

What did I miss?

BR
Maruan

> 
> 
> Andreas
> 
> 
> -------- Forwarded Message --------
> Subject: [NOTICE] Upcoming changes to project web sites hosted via
> Git
> Date: Mon, 5 Apr 2021 20:06:15 +0200
> From: Daniel Gruno 
> To: Users 
> 
> TL;DR: If your project uses subversion or .asf.yaml for publishing
> your project 
> web site, you're fine. If not, please read this and act before June 1st:
> 
> 
> Hi folks,
> In order to simplify workflows and the amount of services we have to
> maintain, 
> we've decided to phase out gitwcsub in favor of stageD. What this
> means in plain 
> English is that all project web sites hosted from a git repository
> MUST have 
> publishing settings in a .asf.yaml file for them to update web sites
> in the future.
> 
> This change will take effect around June 1st, giving projects two
> months to 
> complete the transition. In about four weeks, we'll be sending emails
> directly 
> to every project that is still using the old gitwcsub methods for
> their web 
> sites, encouraging them to switch before we decommission the service.
> 
> Once the gitwcsub service has been shut off, projects that haven't
> switched 
> won't be able to update their web sites until they perform the
> switch.
> 
> For projects wishing to proactively switch, the change is minimal.
> All you need 
> to do is have a file called .asf.yaml at the base of your web site
> repository 
> (or the branch in your git repo that hosts your web site), with the
> following 
> snippet inside:
> 
> publish:
>    whoami: asf-site
> 
> 
> This is, of course, assuming your published branch is the standard
> asf-site 
> branch. It could be master or something else.
> 
> Projects wishing to check where we are currently collecting their web
> sites 
> from, can go to https://infra-reports.apache.org/site-source/ and see
> all our 
> sources for the web sites. The column on your right will tell if you
> are using 
> .asf.yaml for your web site or not.
> 
> For more information on .asf.yaml and its features (such as staging
> web sites 
> etc), please see https://s.apache.org/asfyaml
> 
> With regards,
> Daniel on behalf of ASF Infra.
> 




Re: [VOTE] Release Apache PDFBox 2.0.24

2021-06-08 Thread sahy...@fileaffairs.de
+1

Thanks for preparing the release.

Maruan

On Monday, 2021-06-07 at 18:51 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> a candidate for the PDFBox 2.0.24 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/2.0.24/
> 
> The release candidate is a zip archive of the sources in:
> 
>  http://svn.apache.org/repos/asf/pdfbox/tags/2.0.24/
> 
> The SHA-512 checksum of the archive is 
> 5d55b3cadbbae266d90c47f5b10c9b09b6dc16f53b77a0cf15c78e62fc69afc7b6eab
> 5a4329608ecdf25de9194b38db1f7d23e7d71af473cc1bf7b09b0028642.
> 
> Please vote on releasing this package as Apache PDFBox 2.0.24.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 2.0.24
>  [ ] -1 Do not release this package because...
> 
> 
> Here is my +1
> 
> Andreas
> 






Re: [VOTE] Retire Subproject Preflight

2021-05-27 Thread sahy...@fileaffairs.de
+1 

BR
Maruan

On Thursday, 2021-05-27 at 08:33 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> a discussion came up on dev@pdfbox [1] to retire Preflight and I had
> the 
> impression that we already reached consensus to do so.
> 
> I'd like to run a formal vote so that this topic won't get lost in
> some mailing 
> list thread.
> 
> Please vote on retiring the subproject Preflight with Apache PDFBox
> 4.0.0.
> The vote is open for the next 7 days and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Remove Preflight with Apache PDFBox 4.0.0
>  [ ] -1 Do not remove Preflight because...
> 
> Here is my +1
> 
> Andreas
> 
> P.S.: I've extended the voting period to 7 days to ensure that
> everybody has a 
> chance to think about it and speak up if necessary.
> 
> 
> [1] 
> https://lists.apache.org/thread.html/r8abffe02ff4a94be93b7799b589532dc2a3384d6c5cd727bc388250a%40%3Cdev.pdfbox.apache.org%3E
> 




PDFBox 3.0 Top 5

2021-04-21 Thread sahy...@fileaffairs.de
Dear fellow Dev colleagues,

I'd like to ask you to send me what you believe are the top 5
enhancements to PDFBox with the forthcoming 3.0 release. I'd like to
use that as a starting point for our documentation as an intro article.

With kind regards 

Maruan






Re: [VOTE] Release Apache PDFBox 3.0.0-RC1

2021-04-01 Thread sahy...@fileaffairs.de
+1 
Maruan

On Monday, 2021-03-29 at 19:08 +0200, Andreas Lehmkuehler wrote:
> Hi,
> 
> a candidate for the PDFBox 3.0.0-RC1 release is available at:
> 
>  https://dist.apache.org/repos/dist/dev/pdfbox/3.0.0-RC1/
> 
> The release candidate is a zip archive of the sources in:
> 
>  http://svn.apache.org/repos/asf/pdfbox/tags/3.0.0-RC1/
> 
> The SHA-512 checksum of the archive is 
> b4ed9fec1d5e86422452bda3d9ec66206aa665277d4aebe1e7053a0ef38de211d8440
> 375bcaf05a4a5c0070d2bdfa9d30df94df2c128f6c15c8fb5b008550987.
> 
> Please vote on releasing this package as Apache PDFBox 3.0.0-RC1.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 PDFBox PMC votes are cast.
> 
>  [ ] +1 Release this package as Apache PDFBox 3.0.0-RC1
>  [ ] -1 Do not release this package because...
> 
> Here is my +1
> 
> Andreas
> 
> 
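
For anyone checking the candidate before voting, the SHA-512 value quoted above can be compared against a digest computed locally. What follows is only a minimal sketch using the JDK; the class and method names are made up for illustration, and the archive path is taken from the command line because the mail does not name the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative helper, not part of PDFBox or its release tooling. */
final class Sha512Check {

    /** Returns the SHA-512 digest of the data as lowercase hex. */
    static String sha512Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    /** Drops everything that is not a hex digit: line breaks, "> " quote marks, a trailing dot. */
    static String normalize(String published) {
        return published.replaceAll("[^0-9a-fA-F]", "").toLowerCase();
    }

    /** True if the published checksum, once normalized, equals the digest of the data. */
    static boolean matches(String published, byte[] data) {
        return normalize(published).equals(sha512Hex(data));
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: Sha512Check <archive> <published-checksum>");
            return;
        }
        try {
            // Reads the whole archive into memory, which is fine for a sketch.
            byte[] archive = Files.readAllBytes(Paths.get(args[0]));
            System.out.println(matches(args[1], archive) ? "checksum OK" : "checksum MISMATCH");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The `normalize` step is there because the checksum in the mail is wrapped over two lines and ends with a full stop, so a plain string comparison would fail.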

-- 
-- 
Maruan Sahyoun




Re: [DISCUSS] XMPBox

2021-03-28 Thread sahy...@fileaffairs.de
On Sunday, 28.03.2021 at 18:47 +0200, Tilman Hausherr wrote:
> On 28.03.2021 at 18:44, sahy...@fileaffairs.de wrote:
> > On Sunday, 28.03.2021 at 16:36 +0200, Tilman Hausherr wrote:
> > > I don't have an opinion on XMP because I don't use it.
> > As XMP is needed for getting/setting metadata esp. since PDF 2.0
> > there
> > needs to be support for it - not neccesarily from us directly i.e.
> > we
> > could integrate a different lib.
> > 
> > I'll revert the work done in PDFBOX-5128 and we get back to it
> > after
> > 3.0 - WDYT?
> 
> 
> No, why revert? As far as I understand it, it makes possible that
> XMPs 
> with non standard schemas can still be parsed so that people can 
> retrieve the standard stuff, so that is very useful.

It's still very limited. I can keep it, but as long as the XMP doesn't
conform to the (strict) initial parsing rules, parsing will still fail.
The idea behind reverting was to gain time to work on it properly (if we
decide to do so), or otherwise to keep XMPBox in its previous state,
i.e. targeted at PDF/A-1 conforming XMP.

BR
Maruan

> 
> Tilman
> 
> 
> 
> > 
> > BR
> > Maruan
> > 
> > > Re preflight, I agree with you. It was great but it has hit a
> > > dead end,
> > > and VeraPDF is better because it is more flexible.
> > 
> > > Tilman
> > > 
> > > On 28.03.2021 at 15:52, Andreas Lehmkuehler wrote:
> > > > On 28.03.21 at 15:00, sahy...@fileaffairs.de wrote:
> > > > > Fellow colleagues,
> > > > > 
> > > > > there was some discussion about the ability of XMPBox to
> > > > > parse
> > > > > arbritary XMP which lead to PDFBOX-5128.
> > > > > 
> > > > > Now, after digging into the code and after reading through
> > > > > the
> > > > > various
> > > > > specs for XMP and PDF/A as it stands now XMPBox in it's
> > > > > current
> > > > > implementation is too restricted from the start as it not
> > > > > only per
> > > > > default (although there is a way around it) only supports
> > > > > parsing
> > > > > predefined XMP schemas restricted to the ones defined in
> > > > > PDF/A-1
> > > > > but
> > > > > also does some validation in the parsing phase.
> > > > Exactly the point where I stopped some time ago, when trying to
> > > > just
> > > > expand the parser ;-)
> > > > 
> > > > 
> > > > > Now, in order to get to an implementation for arbritary XMP
> > > > > that
> > > > > needs
> > > > > to change with the validation for PDF/A-1 put on top. We
> > > > > could use
> > > > > the
> > > > > existing implementation in a generalized way, use an existing
> > > > > Java
> > > > > XMP
> > > > > parser such as Adobes XMPCore or approach it in a layered
> > > > > fashion XML -> RDF -> XMP with supporting libs for that.
> > > > > The other option would be to keep XMPBox as is and for
> > > > > general
> > > > > purpose
> > > > > add a general parser into the project or simply refer to
> > > > > XMPCore.
> > > > > 
> > > > > That leads me to the question about the benefit of having a
> > > > > general
> > > > > purpose (ASL licensed) XMP lib as part of PDFBox? Thoughts?
> > > > It replaced JempBox when preflight was added to PDFBox, saying
> > > > that,
> > > > it was a more or less historical reason.
> > > > 
> > > > I myself never needed that XMP-stuff. It is used by TIKA and
> > > > preflight
> > > > and maybe others.
> > > > 
> > > > I have to admit that I already thought about the future of
> > > > preflight.
> > > > I've planned to come up with that topic after releasing 3.0.0,
> > > > but
> > > > why
> > > > waiting.
> > > > 
> > > > Preflight is part of PDFBox but is practically not maintained.
> > > > Preflight support is limited to A1B and I don't see anybody who
> > > > plans
> > > > to extend it. VeraPDF has a lot more to offer and is open
> > > > source as
> > > > well, so maybe a better alternative ...
> > > > 
> > >

Re: [DISCUSS] XMPBox

2021-03-28 Thread sahy...@fileaffairs.de
On Sunday, 28.03.2021 at 16:36 +0200, Tilman Hausherr wrote:
> I don't have an opinion on XMP because I don't use it.

As XMP is needed for getting/setting metadata, especially since PDF 2.0,
there needs to be support for it - not necessarily from us directly,
i.e. we could integrate a different lib.

I'll revert the work done in PDFBOX-5128 and we can get back to it after
3.0 - WDYT?

BR
Maruan

> 
> Re preflight, I agree with you. It was great but it has hit a dead end,
> and VeraPDF is better because it is more flexible.


> 
> Tilman
> 
> On 28.03.2021 at 15:52, Andreas Lehmkuehler wrote:
> > On 28.03.21 at 15:00, sahy...@fileaffairs.de wrote:
> > > Fellow colleagues,
> > > 
> > > there was some discussion about the ability of XMPBox to parse
> > > arbritary XMP which lead to PDFBOX-5128.
> > > 
> > > Now, after digging into the code and after reading through the
> > > various
> > > specs for XMP and PDF/A as it stands now XMPBox in it's current
> > > implementation is too restricted from the start as it not only per
> > > default (although there is a way around it) only supports parsing
> > > predefined XMP schemas restricted to the ones defined in PDF/A-1
> > > but
> > > also does some validation in the parsing phase.
> > Exactly the point where I stopped some time ago, when trying to just 
> > expand the parser ;-)
> > 
> > 
> > > Now, in order to get to an implementation for arbritary XMP that
> > > needs
> > > to change with the validation for PDF/A-1 put on top. We could use
> > > the
> > > existing implementation in a generalized way, use an existing Java
> > > XMP
> > > parser such as Adobes XMPCore or approach it in a layered fashion
> > > XML -> RDF -> XMP with supporting libs for that.
> > > 
> > > The other option would be to keep XMPBox as is and for general
> > > purpose
> > > add a general parser into the project or simply refer to XMPCore.
> > > 
> > > That leads me to the question about the benefit of having a general
> > > purpose (ASL licensed) XMP lib as part of PDFBox? Thoughts?
> > It replaced JempBox when preflight was added to PDFBox, saying that, 
> > it was a more or less historical reason.
> > 
> > I myself never needed that XMP-stuff. It is used by TIKA and
> > preflight 
> > and maybe others.
> > 
> > I have to admit that I already thought about the future of preflight.
> > I've planned to come up with that topic after releasing 3.0.0, but
> > why 
> > waiting.
> > 
> > Preflight is part of PDFBox but is practically not maintained. 
> > Preflight support is limited to A1B and I don't see anybody who plans
> > to extend it. VeraPDF has a lot more to offer and is open source as
> > well, so maybe a better alternative ...
> > 
> > How about removing preflight with 4.0.0? This would remove the one
> > and 
> > only hard dependency of XMPBox, so that it would be easier to decide 
> > if we really need to maintain out own XMP lib.
> > 
> > 
> > Andreas
> > 
> 

-- 
-- 
Maruan Sahyoun




Re: [DISCUSS] XMPBox

2021-03-28 Thread sahy...@fileaffairs.de
Quick addition: I'm happy to put in the work if we think it's worth the
effort.

Maruan

On Sunday, 28.03.2021 at 15:00 +0200, sahy...@fileaffairs.de wrote:
> Fellow colleagues,
> 
> there was some discussion about the ability of XMPBox to parse
> arbritary XMP which lead to PDFBOX-5128.
> 
> Now, after digging into the code and after reading through the
> various
> specs for XMP and PDF/A as it stands now XMPBox in it's current
> implementation is too restricted from the start as it not only per
> default (although there is a way around it) only supports parsing
> predefined XMP schemas restricted to the ones defined in PDF/A-1 but
> also does some validation in the parsing phase.
> 
> Now, in order to get to an implementation for arbritary XMP that
> needs
> to change with the validation for PDF/A-1 put on top. We could use
> the
> existing implementation in a generalized way, use an existing Java
> XMP
> parser such as Adobes XMPCore or approach it in a layered fashion
> XML -> RDF -> XMP with supporting libs for that.
> 
> The other option would be to keep XMPBox as is and for general
> purpose
> add a general parser into the project or simply refer to XMPCore.
> 
> That leads me to the question about the benefit of having a general
> purpose (ASL licensed) XMP lib as part of PDFBox? Thoughts?
> 
> BR    
>  

-- 
-- 
Maruan Sahyoun






[DISCUSS] XMPBox

2021-03-28 Thread sahy...@fileaffairs.de
Fellow colleagues,

there was some discussion about the ability of XMPBox to parse
arbitrary XMP, which led to PDFBOX-5128.

After digging into the code and reading through the various specs for
XMP and PDF/A, my conclusion is that XMPBox in its current
implementation is too restricted from the start: by default (although
there is a way around it) it only supports parsing predefined XMP
schemas, restricted to the ones defined in PDF/A-1, and it also does
some validation in the parsing phase.

To get to an implementation for arbitrary XMP, that needs to change,
with the validation for PDF/A-1 put on top. We could use the existing
implementation in a generalized way, use an existing Java XMP parser
such as Adobe's XMPCore, or approach it in a layered fashion
(XML -> RDF -> XMP) with supporting libs for that.

The other option would be to keep XMPBox as is and, for general-purpose
use, add a general parser to the project or simply refer to XMPCore.

That leads me to the question: what is the benefit of having a general
purpose (ASL licensed) XMP lib as part of PDFBox? Thoughts?
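
To make the layered idea a bit more concrete, here is a rough sketch of what schema-agnostic reading could look like with nothing but the JDK's XML parser. It is not XMPBox or XMPCore code: the class name, the `{namespace}localName` key format and the example namespace are assumptions for illustration, and it only handles simple values (no arrays, structures or language alternatives).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: XML layer, then RDF layer, then a flat property map. */
final class LenientXmpReader {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Collects simple properties from every rdf:Description, whatever their schema. */
    static Map<String, String> readSimpleProperties(String xmpPacket) {
        Map<String, String> properties = new LinkedHashMap<>();
        Document document = parseXml(xmpPacket);                       // XML layer
        NodeList descriptions =
                document.getElementsByTagNameNS(RDF_NS, "Description"); // RDF layer
        for (int i = 0; i < descriptions.getLength(); i++) {
            Element description = (Element) descriptions.item(i);
            // Properties written in attribute form, e.g. ex:Flag="yes".
            NamedNodeMap attributes = description.getAttributes();
            for (int a = 0; a < attributes.getLength(); a++) {
                Node attribute = attributes.item(a);
                String ns = attribute.getNamespaceURI();
                boolean bookkeeping = ns == null || RDF_NS.equals(ns)
                        || attribute.getNodeName().startsWith("xmlns");
                if (!bookkeeping) {
                    properties.put(key(attribute), attribute.getNodeValue());
                }
            }
            // Properties written in element form; structured values are skipped.
            for (Node child = description.getFirstChild(); child != null;
                    child = child.getNextSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE && !hasElementChild(child)) {
                    properties.put(key(child), child.getTextContent().trim());
                }
            }
        }
        return properties;
    }

    private static Document parseXml(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("XMP packet is not well-formed XML", e);
        }
    }

    private static boolean hasElementChild(Node node) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                return true;
            }
        }
        return false;
    }

    private static String key(Node node) {
        return "{" + node.getNamespaceURI() + "}" + node.getLocalName();
    }

    public static void main(String[] args) {
        String xmp = "<x:xmpmeta xmlns:x='adobe:ns:meta/'>"
                + "<rdf:RDF xmlns:rdf='" + RDF_NS + "'>"
                + "<rdf:Description rdf:about='' xmlns:ex='http://example.com/ns/custom/'"
                + " ex:Flag='yes'><ex:Owner>Jane</ex:Owner></rdf:Description>"
                + "</rdf:RDF></x:xmpmeta>";
        readSimpleProperties(xmp).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

In such a design the PDF/A-1 checks would sit on top as a separate pass over the resulting map rather than inside the parser.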

BR
 
-- 
-- 
Maruan Sahyoun




Re: PDFBox 3.0.0 RC1 ??

2021-03-21 Thread sahy...@fileaffairs.de
+1

Maruan

On Sunday, 21.03.2021 at 14:35 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> I'd like to cut a first release candidate for 3.0.0 from the trunk.
> That RC is 
> meant to be feature ready but there may be some API changes if
> necessary when 
> fixing outstanding bugs or other issue we'd like to fix before
> releasing a final 
> version.
> 
> How about cutting the first release candidate in about week from now?
> 
> 
> Cheers
> Andreas
> 
> 

-- 
-- 
Maruan Sahyoun




Re: 2.0.22 vs 2.0.23

2021-03-12 Thread sahy...@fileaffairs.de
On Friday, 12.03.2021 at 08:15 -0500, Tim Allison wrote:
> > would it make sense to add that support? If yes could we get samples
> > of
> > various schema to support that development? Could look into that if
> > we
> > think that's worth the effort
> 
> I think I can find some XMPs if they'd be of any use! :D

That would be great - maybe together with expected extraction results -
so I can start with proper unit tests. If you could add them to

https://issues.apache.org/jira/browse/PDFBOX-5128

that would be great.

BR

> 
> 

-- 
-- 
Maruan Sahyoun






Re: 2.0.22 vs 2.0.23

2021-03-11 Thread sahy...@fileaffairs.de
On Thursday, 11.03.2021 at 07:56 +0100, Tilman Hausherr wrote:
> On 11.03.2021 at 07:46, Andreas Lehmkuehler wrote:
> > On 11.03.21 at 07:24, Tilman Hausherr wrote:
> > > new report
> > > http://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.22_vs_2.0.23_5.tar.xz
> > > 
> > > The content differences part is now the smallest ever, likely due
> > > to 
> > > my change in tika-eval (TIKA-3314) and restoring a PDFBox code 
> > > segment I accidentally deleted (PDFBOX-5115).
> > Cool!!
> > 
> > > There are three new exceptions. Two are in jempbox and one is in
> > > tika 
> > > itself so I suspect PDFBox isn't to blame. I'll look at it too if
> > > I 
> > > have the time.
> > As far as I remember the jempbox issue isn't new, Tim mentioned it 
> > some time ago. Just out of curiosity does it make sense to use an
> > old 
> > lib to extract metadata? Is there anything missing in xmpbox but 
> > available in jempbox?
> > 
> The three new exceptions weren't in earlier reports.
> 
> IIRC the reason Tika uses Jempbox is because Xmpbox fails when there
> is 
> a non standard schema.

Would it make sense to add that support? If yes, could we get samples of
various schemas to support that development? I could look into it if we
think it's worth the effort.

Maruan


> 
> Tilman
> 
> 
> 

-- 
-- 
Maruan Sahyoun




Re: Status of remaining PDFBox 3.0.0 tickets

2021-01-13 Thread sahy...@fileaffairs.de
Hi,

On Thursday, 14.01.2021 at 08:10 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> there are just a handful of open tickets left aiming our upcoming
> 3.0.0 release 
> [1]. It looks like all of them had some progress but I'm unable to
> decide if 
> they are done or not.
> 
> Is there any chance to update their status? Possible TODOs could be
> postpone by 
> creating new tickets.
> 
> @Maruan, @Tilman
> Can you have a look at the acroform/signing tickets, please?

The AcroForm-related ones I've done; I can't comment on the signing ones.

> 
> I tend to resolve [2] and create a new ticket for the remaining TODOs
> (mostly 
> refactoring)
> 
> I guess the migration guide won't be complete until we release the
> final version.

This is an ongoing effort. I expect work to continue after releasing
the final version in order to incorporate users' feedback.

BR
Maruan


> 
> Andreas
> 
> [1] https://s.apache.org/bhq1e
> [2] https://issues.apache.org/jira/browse/PDFBOX-4952
> 
> 

-- 
-- 
Maruan Sahyoun






Re: Apache PDFBox Board Report January 2021 due

2021-01-12 Thread sahy...@fileaffairs.de
+1

Sorry, I missed that.
BR
Maruan

On Saturday, 09.01.2021 at 13:28 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> find attached a quick draft of the board report we're expected to
> submit this
> month. It's based upon the report wizard template which can be found
> at [1]
> 
> Any comments or additions are appreciated ...
> 
> 
> 
> ## Description:
> The mission of PDFBox is the creation and maintenance of software
> related to
> Java library for working with PDF documents
> 
> ## Issues:
> There are no issue requiring board attention at this time.
> 
> ## Membership Data:
> Apache PDFBox was founded 2009-10-21 (11 years ago)
> There are currently 21 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
> 
> Community changes, past quarter:
> - No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
> - No new committers. Last addition was Joerg O. Henne on 2017-10-09.
> 
> ## Project Activity:
> Recent releases:
> 
>  2.0.22 was released on 2020-12-19.
>  2.0.21 was released on 2020-08-20.
>  2.0.20 was released on 2020-05-07.
> 
> ## Community Health:
> - there is a steady stream of contributions, bug reports and
> questions on the
>    mailing lists
> - there are a lot of refactorings, improvements and bugfixes
> - we started finalizing the next major release 3.0.0. We expect to
> prepare a
>    first release candidate soon
> 
> 
> 
> Andreas
> 
> [1] https://reporter.apache.org/wizard/?pdfbox
> 
> 

-- 
-- 
Maruan Sahyoun




Re: Xmas 2.0.22 Release?

2020-12-13 Thread sahy...@fileaffairs.de
On Sunday, 13.12.2020 at 15:37 +0100, Tilman Hausherr wrote:
> new report available at
> http://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.21_vs_2.0.22.tar.xz
> No more new exceptions.
> 
> What we should work on - doesn't have to be now - is on the
> exceptions 
> mentioned in stack_traces_by_mime_B. There are many uncaught
> exceptions 
> there.

How do I find out which file is causing each of these?

> 
> NegativeArraySizeException
> IllegalArgumentException
> NullPointerException
> StackOverflowError (!)
> ArrayIndexOutOfBoundsException
> ClassCastException
> IndexOutOfBoundsException
> StringIndexOutOfBoundsException
> 
> It's possible that we never noticed these because the test set was 
> expanded and we always looked at the differences, except the very
> first 
> time long ago when there was only digitalcorpora.
> 
> Tilman
> 
> 
> 

-- 
-- 
Maruan Sahyoun




Re: code coverage

2020-12-12 Thread sahy...@fileaffairs.de
On Saturday, 12.12.2020 at 18:11 +0100, Tilman Hausherr wrote:
> We have reached 50% on code coverage 
Yes, and looking at the new code we are getting better (maybe not that
much for code coverage): Bugs, Vulnerabilities and Security Hotspots are
now at 0, which wasn't the case 2-3 weeks ago.

The whole idea is to also look at the new code and make sure that we
meet the targets there.




-- 
-- 
Maruan Sahyoun



Re: Xmas 2.0.22 Release?

2020-12-11 Thread sahy...@fileaffairs.de
On Friday, 11.12.2020 at 14:58 +0100, Tilman Hausherr wrote:
> The exceptions are mostly about the acroform fixup.
> This fails when the font can't be used.
> 
> bug_trackers/PDFBOX/PDFBOX-4086-0.pdf
> bug_trackers/PDFBOX/PDFBOX-4086-1.pdf
> bug_trackers/PDFBOX/PDFBOX-4086-2.pdf
> bug_trackers/PDFBOX/PDFBOX-3587-0.zip-5.pdf
> bug_trackers/PDFBOX/PDFBOX-3642-0.pdf

they should be fixed now.

> 
> 
> However I wonder if Tika should also be changed: it doesn't need the 
> appearances for text extraction. However it could use the field
> repair.

That would be beneficial - it's also the reason why there are multiple
processors, each with a single purpose.

> 
> Tilman
> 
> 
> On 11.12.2020 at 13:07, Tilman Hausherr wrote:
> > I had a quick look
> > - 32 new exceptions
> > - content is a bit better, for NUM_COMMON_TOKENS the new version 
> > extracts 100.41% of the old one.
> > 
> > Tilman
> > 
> > On 11.12.2020 at 13:04, Tilman Hausherr wrote:
> > >  
> > > http://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.21_vs_2.0.22.tar.xz
> > >  
> > 
> > 
> > 
> > 
> > 
> 

-- 
-- 
Maruan Sahyoun




Re: Xmas 2.0.22 Release?

2020-12-03 Thread sahy...@fileaffairs.de
On Thursday, 03.12.2020 at 17:46 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> looks like there are again a lot of unreleased fixes/improvements.
> How about 
> cutting 2.0.22 in about 1-2 week from now?

+1

> 
> @Tim or Tilman
> Does anyone of you have the time to run the test arena?

Would it make sense to do an initial test of 3.0?

> 
> WDYT?
> 
> Andreas
> 
> 

-- 
-- 
Maruan Sahyoun







Re: [DISCUSS] PDFBox 3.0.0

2020-12-02 Thread sahy...@fileaffairs.de
getting used to use
> > > git/github instead of 
> > > svn + patches.
> > > 
> > > 
> > > Andreas
> > > 
> > > > BR
> > > > Maruan
> > > > 
> > > > 
> > > > 
> > > > 
> 

-- 
-- 
Maruan Sahyoun




Re: [Heads-Up] Documentation

2020-11-24 Thread sahy...@fileaffairs.de
On Tuesday, 24.11.2020 at 08:25 +0100, Andreas Lehmkuehler wrote:
> On 22.11.20 at 21:19, sahy...@fileaffairs.de wrote:
> > Dear Dev team,
> > 
> > in order to provide a base to slowly enhance our documentation I'm
> > currently working on an addition to our site generator which
> > already
> > works in my local repo. This will allow to add code snippets from
> > our
> > examples into the generated docs. To use it the following code
> > needs to
> > be put into a document where the code shall appear (as an example
> > I'm
> > using a reference to the CreateCheckBox.java example for current
> > trunk.
> > 
> > ``` java
> > {% codesnippet 'interactive/form/CreateCheckBox.java' 'trunk' %}
> > ```
> > 
> > In addition - in order to be able to only put parts of the code
> > into
> > the documentation the following comments can be added to the java
> > code
> > 
> > //DOC-START
> > ...
> > //DOC-END
> > 
> > The DOC-START/DOC-END pair can be placed multiple times into the
> > java
> > code. Everything between these special comment lines will be added
> > the
> > other content will be omitted. This will allow us to skip license
> > header, import statements etc. to concentrate on the important
> > bits.
> > 
> > This way we have the benefit of testable code but also the ability
> > to
> > reuse that in our docs.
> > 
> > WDYT?
> I like the idea, thanks for the effort.
> 
> Just out of curiosity, how does the process work? Do those pages
> include the 
> code snippets dynamically or are the pages still static, so that we
> have to 
> regenerate the website after each change within the relevant code
> pieces?

The code snippets are embedded when the site is generated, i.e. they are
not fetched at runtime. Fetching at runtime would be doable, of course.
Given that the examples no longer change for a release once it is out, I
think the static approach is suitable.

BR
Maruan

> 
> Andreas
> > 
> > BR
> > Maruan
> > 
> > 
> > 
> > 
> > 
> 






[Heads-Up] Documentation

2020-11-22 Thread sahy...@fileaffairs.de
Dear Dev team,

in order to provide a base to slowly enhance our documentation I'm
currently working on an addition to our site generator, which already
works in my local repo. This will allow us to add code snippets from our
examples to the generated docs. To use it, the following code needs to
be put into a document where the code shall appear (as an example I'm
using a reference to the CreateCheckBox.java example for current trunk):

``` java
{% codesnippet 'interactive/form/CreateCheckBox.java' 'trunk' %}
```

In addition, in order to put only parts of the code into the
documentation, the following comments can be added to the Java code:

//DOC-START
...
//DOC-END

The DOC-START/DOC-END pair can be placed multiple times in the Java
code. Everything between these special comment lines will be added; the
other content will be omitted. This will allow us to skip the license
header, import statements etc. and concentrate on the important bits.
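
To illustrate the marker handling, here is a small stand-alone sketch of the filtering step. It is not the site generator code itself; the class name is made up, and returning the whole file when no marker is present is an assumption on my side.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of DOC-START/DOC-END filtering; not the actual site generator code. */
final class DocSnippetExtractor {

    static final String START = "//DOC-START";
    static final String END = "//DOC-END";

    /**
     * Returns the lines enclosed by marker pairs, in source order.
     * If the source contains no start marker at all, every line is
     * returned (assumed behaviour).
     */
    static List<String> extract(List<String> sourceLines) {
        List<String> snippet = new ArrayList<>();
        boolean sawMarker = false;
        boolean inside = false;
        for (String line : sourceLines) {
            String trimmed = line.trim();
            if (trimmed.equals(START)) {
                sawMarker = true;
                inside = true;
            } else if (trimmed.equals(END)) {
                inside = false;
            } else if (inside) {
                snippet.add(line);   // keep original indentation
            }
        }
        return sawMarker ? snippet : new ArrayList<>(sourceLines);
    }

    public static void main(String[] args) {
        List<String> source = List.of(
                "import java.io.IOException;",
                "//DOC-START",
                "PDDocument document = new PDDocument();",
                "//DOC-END",
                "int ignored = 0;",
                "    //DOC-START",
                "    document.close();",
                "    //DOC-END");
        extract(source).forEach(System.out::println);
    }
}
```

The marker lines themselves never reach the output, so several marked regions of one example end up as one continuous snippet.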

This way we have the benefit of testable code but also the ability to
reuse that in our docs.

WDYT?

BR
Maruan








Re: Unicode codepoint conversions

2020-11-18 Thread sahy...@fileaffairs.de


On Wednesday, 18.11.2020 at 13:58 +0200, Constantine Dokolas wrote:
> I noticed that writing some codepoints to a PDF and then reading back
> the
> text from the generated PDF (via PDFTextStripper), I see some
> conversions
> happening. For example, the simple hyphen character (0x2D, "HYPHEN-
> MINUS")
> gets converted to a non-breaking hyphen (0x2011, "NON-BREAKING
> HYPHEN").
> 
> Since I'm writing unit tests to verify that everything gets written
> correctly in the PDF from my end (PDF generation), I need to know
> why, when
> and how these conversions take place (I first noticed them while
> writing
> some CJK codepoints). Any suggestions/pointers?
> 

Could you share a code snippet showing how you are writing/retrieving
the data?
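
In the meantime, something like the following might help to pin down where the written and the extracted text diverge. It is only a sketch with plain JDK calls and hypothetical names; the two strings in `main` are stand-ins for what was written and what was read back, no PDFBox call is involved.

```java
/** Hypothetical test helper for comparing text code point by code point. */
final class CodePointDump {

    /** Formats every code point of the text as U+XXXX, separated by spaces. */
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    /** Index (counted in code points) of the first difference, or -1 if equal. */
    static int firstMismatch(String expected, String actual) {
        int[] e = expected.codePoints().toArray();
        int[] a = actual.codePoints().toArray();
        int common = Math.min(e.length, a.length);
        for (int i = 0; i < common; i++) {
            if (e[i] != a[i]) {
                return i;
            }
        }
        return e.length == a.length ? -1 : common;
    }

    public static void main(String[] args) {
        String written = "a-b";          // contains U+002D
        String extracted = "a\u2011b";   // stand-in for the text read back
        System.out.println("written:   " + describe(written));
        System.out.println("extracted: " + describe(extracted));
        int index = firstMismatch(written, extracted);
        System.out.println("first mismatch at code point index " + index);
        if (index >= 0) {
            int cp = extracted.codePoints().toArray()[index];
            System.out.println("extracted character: " + Character.getName(cp));
        }
    }
}
```

Comparing by code point rather than by `char` keeps the indices meaningful for CJK text outside the Basic Multilingual Plane.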

BR
Maruan 

> Constantine
> --
> There is a computer disease that anybody who works with computers
> knows
> about. It's a very serious disease and it interferes completely with
> the
> work. The trouble with computers is that you 'play' with them!
> - Richard P. Feynman






Re: ossindex-maven-plugin and build issue

2020-11-15 Thread sahy...@fileaffairs.de
Can we specify a different JUnit version, e.g. JUnit 5, or isn't that
supported by our build environment?

On Sunday, 15.11.2020 at 14:19 +0100, Andreas Lehmkuehler wrote:
> Hi,
> 
> the ossindex-maven-plugin triggers the recent build issue. It
> complains about 
> the junit plugin because of a vulnerability see CVE-2020-15250.
> 
> Any idea on how to solve that, without deactivating the plugin?
> 
> Andreas
> 
> https://nvd.nist.gov/vuln/detail/CVE-2020-15250
> 
> 


