Re: PDFBox 4.0 and development plans

2023-10-19 Thread Andreas Lehmkühler



@Maruan, thanks for starting this initiative :-)



On 11.10.23 at 07:53, sahy...@fileaffairs.de wrote:

Dear colleagues,

with 3.0 being released and 4.0 being started I'd like to start
discussing what the major plans are for 4.0. And maybe in a way that
the release can be made faster than what we had for 3.0. (maybe size it
in a way that we can do the dev stuff by spring 2024 and then release
in summer 2024 followed by a 4.1 release to add to that instead of
doing a big bang like 3.0)

Sounds good to me.



Shall we share some ideas via the mailing list or start a page on our
website? (I think the ML is easier to do.) We can still document the
major initiatives in a blog post as soon as we have agreed.

I agree, we need some sort of plan for the next version to avoid another
big bang release. It doesn't have to be that formal, but we should agree
on the bigger changes to be added to the next major release.



Here are my current thoughts (some of which might also be backported to
3.0) in no particular order

- appearance stream handlers for interactive form widgets (similar to
what we have for annotations) also allowing one to add their own
handler
- replacement or at least new base for XMPBox (current thought is to
have a new base parser and, if possible, add the current XMPBox end-user
API on top - might be able to reuse the xmlgraphics XMP lib). This would
allow us to deal better with non-standard XMPs and make it easier to add
to existing XMPs at a low level.

IMHO XMP support is optional rather than essential, so it is a good idea
to use an existing lib instead of implementing our own.



- then we had the discussion about an event handler/listener similar to
what fop provides so one can listen to corrections/repairs done under
the hood (I know that we can only lay the ground for that as this is a
major undertaking given all the places where we correct things)

That might be a big thing ...


- enhance the parsing to keep the information about incremental
versions (better debugging, trace of changes done ...)

I'm not sure which details might be important, but let us start a discussion.


- review and add some more PDF 2.0 capabilities

In most cases this can be done in little steps


- better text formatting/language support (maybe by including fop parts
or looking into using HarfBuzz)
- I'd also like to discuss reaching out to fop to look at integrating
some of their font handling into fontbox

Good ideas as well 


...

That list is already long and I think would be too much given above
idea of release planning.

;-)


With regards to versioning I'd like to propose that we have 2.0 as LTS
and 4.x being the next LTS.

First of all, what is your definition of an LTS version? Of course it is
a long-term version, but how long is "long", and when does such a version
reach EOL?


Why did you choose 2.0 as LTS? 2.0.0 was released in 2016, doesn't that
already qualify as LTS? 2.0 requires Java 6, a very old version.
Why not choose 3.0 as LTS? It requires Java 8, a more or less old
version but still widely used and the last version before they started
removing APIs. 3.0 is the last version including preflight.
We should discuss that in a separate thread, I just wanted to share my
thoughts as a starter.





Thoughts
BR
Maruan



-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org






Re: PDFBox 4.0 and development plans

2023-10-14 Thread sahy...@fileaffairs.de
On Saturday, 14.10.2023 at 13:36 +0200, Tilman Hausherr wrote:
> On 11.10.2023 07:53, sahy...@fileaffairs.de wrote:
> > With regards to versioning I'd like to propose that we have 2.0 as
> > LTS 
> > and 4.x being the next LTS.

Although the two projects cannot be compared, maybe we can adopt
something similar:
https://camel.apache.org/blog/2020/03/LTS-Release-Schedule/

e.g. we do a 4.0.0 as LTS;
4.1.0, 4.2.0 ... will not be LTS, 4.3.0 might be.

Where 4.x are feature releases with faster increments than what we do
today. E.g. we might not implement all PDF 2.0 bits in 4.0 but maybe
only for certain parts like annotations, or forms or signature algos ..
and then add the next bits in 4.1 and so on planning the boundaries
upfront.

This way instead of doing a big release like 3.0 after a very long
period of time we have faster cycles. 

Bug fixes will only be done to LTS e.g. in my sample above 4.0 will
have patch releases like 4.0.1, 4.0.2 ... but 4.1 will not. We can also
limit the lifetime for LTS to a certain (not too long) period of time.
Depends on how fast we think we can do minor releases. LTS would
receive bug fixes only - no new features. So users with less
critical applications can adopt new releases more quickly. And because
each release has less new stuff we limit the risk of breaking things.
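
To make the proposed scheme concrete, here is a minimal sketch in Java. Everything in it is hypothetical and only for illustration: the class name ReleasePolicy, the choice of 4.0 and 4.3 as LTS lines (4.3 is only a "might" above), and the assumption that fixes for a non-LTS line arrive with the next feature release. It is not a decided policy or existing PDFBox code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the release scheme discussed above; not PDFBox code. */
final class ReleasePolicy {
    // Assumption for illustration only: 4.0 is LTS and 4.3 becomes the next one.
    private static final Set<String> LTS_LINES =
            new HashSet<>(Arrays.asList("4.0", "4.3"));

    /** Returns the "major.minor" line of a version such as "4.0.2". */
    static String lineOf(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected major.minor[.patch]: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    /** True if the version belongs to a line that receives patch releases. */
    static boolean isLts(String version) {
        return LTS_LINES.contains(lineOf(version));
    }

    /**
     * Version in which a bug fix would be published: the next patch release
     * on an LTS line, otherwise (assumed) the next feature release.
     */
    static String nextBugfixVersion(String version) {
        String[] parts = version.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int patch = parts.length > 2 ? Integer.parseInt(parts[2]) : 0;
        if (isLts(version)) {
            return major + "." + minor + "." + (patch + 1);
        }
        return major + "." + (minor + 1) + ".0";
    }

    public static void main(String[] args) {
        for (String v : new String[] {"4.0.1", "4.1.0", "4.3.0"}) {
            System.out.println(v + " LTS=" + isLts(v)
                    + " next bug fix in " + nextBugfixVersion(v));
        }
    }
}
```

With these assumptions a fix for 4.0.1 ships as 4.0.2, while a fix found in 4.1.0 ships with 4.2.0.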

From that perspective we need to decide for 4.0 what will be breaking
changes. Others can be done either in 4.0 or later. E.g. appearance
handlers for form widgets can be done in a non-breaking but additive
manner.

BR
Maruan


> 
> Yes, very good idea.
> 
> Tilman
> 



Re: PDFBox 4.0 and development plans

2023-10-14 Thread Tilman Hausherr

On 11.10.2023 07:53, sahy...@fileaffairs.de wrote:
With regards to versioning I'd like to propose that we have 2.0 as LTS 
and 4.x being the next LTS.


Yes, very good idea.

Tilman





Re: PDFBox 4.0 and development plans

2023-10-14 Thread Tilman Hausherr
XMP, incremental versions record keeping, 2.0 features sound interesting 
to me.


My thoughts are mostly about existing issues... these are the ones that 
people have spoken out about.


Tilman




Re: PDFBox 4.0 and development plans

2023-10-13 Thread sahy...@fileaffairs.de
On Friday, 13.10.2023 at 08:23 +0200, Andreas Lehmkühler wrote:
>
>
> On 13.10.23 at 04:40, axh wrote:
> > Hi,
> >
> > I suggest we also revisit logging. Last week I opened an issue for
> > that (PDFBOX-5695), but it seems everybody is tired of this subject
> > and no one even looked at it. Nonetheless, please take a look. The
> > last time a switch to a logging facade was proposed (and rejected)
> > was 10 years ago. I think it is worth reconsidering, and a new major
> > release would be the right time to make a change like that. More
> > details in the issue.
> >
> Please don't give up on us too early. We are all volunteers with
> limited time and different priorities.
>
> > Whatever the project decides, I am willing to contribute the
> > required patch(es).
> We highly appreciate that.
>
> I personally feel no pressure to switch the logging framework, but
> I see that an overhaul of that part of PDFBox is long overdue.
>
> I tend to agree with Tilman and I'd like to use log4j2. I hope I'll
> find some time to comment on your proposal next weekend.

+1 to switch to log4j2. The benefits are described in the ticket. log4j
over slf4j -> because of the Apache License and "family", although
license-wise MIT would be compliant AFAIU.

Maruan 




Re: PDFBox 4.0 and development plans

2023-10-13 Thread Andreas Lehmkühler




On 13.10.23 at 04:40, axh wrote:

Hi,

I suggest we also revisit logging. Last week I opened an issue for that (PDFBOX-5695),
but it seems everybody is tired of this subject and no one even looked at it.
Nonetheless, please take a look. The last time a switch to a logging facade was
proposed (and rejected) was 10 years ago. I think it is worth reconsidering, and a
new major release would be the right time to make a change like that. More details
in the issue.

Please don't give up on us too early. We are all volunteers with limited
time and different priorities.



Whatever the project decides, I am willing to contribute the required patch(es).

We highly appreciate that.

I personally feel no pressure to switch the logging framework, but
I see that an overhaul of that part of PDFBox is long overdue.


I tend to agree with Tilman and I'd like to use log4j2. I hope I'll find
some time to comment on your proposal next weekend.



Andreas






Re: PDFBox 4.0 and development plans

2023-10-12 Thread Tilman Hausherr

On 13.10.2023 04:40, axh wrote:
but it seems everybody is tired of this subject and no one even
looked at it.


I did... but it's really not the kind of thing I have a strong opinion 
about. I'm happy the way it is, I use the same at work. If we ever 
switch then we should use log4j because it's "Apache family".


I'll reread the issue this weekend and think about what Maruan wrote 
too. (I just woke up so I can't really think at all)


Tilman





Re: PDFBox 4.0 and development plans

2023-10-12 Thread axh
Hi,

I suggest we also revisit logging. Last week I opened an issue for that
(PDFBOX-5695), but it seems everybody is tired of this subject and no one
even looked at it. Nonetheless, please take a look. The last time a switch to
a logging facade was proposed (and rejected) was 10 years ago. I think it is
worth reconsidering, and a new major release would be the right time to make a
change like that. More details in the issue.
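
For readers unfamiliar with the term, a rough sketch of what a logging facade means here: library code talks only to a small interface, and the application decides which backend receives the messages. The sketch uses only java.util.logging from the JDK. The names PdfLog, JulPdfLog, MemoryPdfLog and FacadeDemo are made up for this illustration; they are not PDFBox, Commons Logging, SLF4J or Log4j API, and this is not the change proposed in the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical facade: library code depends only on this interface. */
interface PdfLog {
    void warn(String message);
}

/** Backend 1: forwards every message to java.util.logging. */
final class JulPdfLog implements PdfLog {
    private final Logger delegate;

    JulPdfLog(String name) {
        this.delegate = Logger.getLogger(name);
    }

    @Override
    public void warn(String message) {
        delegate.log(Level.WARNING, message);
    }
}

/** Backend 2: collects messages in memory, e.g. for tests. */
final class MemoryPdfLog implements PdfLog {
    final List<String> messages = new ArrayList<>();

    @Override
    public void warn(String message) {
        messages.add(message);
    }
}

/** Stand-in for library code; it never names a concrete logging framework. */
final class FacadeDemo {
    static void parse(PdfLog log) {
        log.warn("xref table repaired");
    }

    public static void main(String[] args) {
        parse(new JulPdfLog("demo"));      // handled by the JUL handlers
        MemoryPdfLog memory = new MemoryPdfLog();
        parse(memory);                      // same call, different backend
        System.out.println(memory.messages);
    }
}
```

The point of the design is that swapping the backend does not touch the calling code, which is why a major release is a natural moment for such a switch.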

Whatever the project decides, I am willing to contribute the required patch(es).

Cheers,
Axel 




PDFBox 4.0 and development plans

2023-10-10 Thread sahy...@fileaffairs.de
Dear colleagues,

with 3.0 being released and 4.0 being started I'd like to start
discussing what the major plans are for 4.0. And maybe in a way that
the release can be made faster than what we had for 3.0. (maybe size it
in a way that we can do the dev stuff by spring 2024 and then release
in summer 2024 followed by a 4.1 release to add to that instead of
doing a big bang like 3.0) 

Shall we share some ideas via the mailing list or start a page on our
website? (I think the ML is easier to do.) We can still document the
major initiatives in a blog post as soon as we have agreed.

Here are my current thoughts (some of which might also be backported to
3.0) in no particular order

- appearance stream handlers for interactive form widgets (similar to
what we have for annotations) also allowing one to add their own
handler
- replacement or at least new base for XMPBox (current thought is to
have a new base parser and, if possible, add the current XMPBox end-user
API on top - might be able to reuse the xmlgraphics XMP lib). This would
allow us to deal better with non-standard XMPs and make it easier to add
to existing XMPs at a low level.
- then we had the discussion about an event handler/listener similar to
what fop provides so one can listen to corrections/repairs done under
the hood (I know that we can only lay the ground for that as this is a
major undertaking given all the places where we correct things)
- enhance the parsing to keep the information about incremental
versions (better debugging, trace of changes done ...)
- review and add some more PDF 2.0 capabilities
- better text formatting/language support (maybe by including fop parts
or looking into using HarfBuzz)
- I'd also like to discuss reaching out to fop to look at integrating
some of their font handling into fontbox
...

That list is already long and I think would be too much given above
idea of release planning.
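
As a starting point for the event handler/listener item in the list above, a minimal sketch of what listening to repairs could look like. All names here (RepairListener, LenientParserSketch, addRepairListener, parseLength) are hypothetical; nothing like this exists in PDFBox today, and the real design would need to be discussed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback; no such interface exists in PDFBox today. */
interface RepairListener {
    void repaired(String location, String description);
}

/** Stand-in for a lenient parser that reports each correction it makes. */
final class LenientParserSketch {
    private final List<RepairListener> listeners = new ArrayList<>();

    void addRepairListener(RepairListener listener) {
        listeners.add(listener);
    }

    private void fireRepaired(String location, String description) {
        for (RepairListener listener : listeners) {
            listener.repaired(location, description);
        }
    }

    /** Parses a decimal length; a malformed value is replaced by 0 and reported. */
    int parseLength(String raw) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            fireRepaired("stream /Length", "invalid value '" + raw + "' replaced by 0");
            return 0;
        }
    }

    public static void main(String[] args) {
        LenientParserSketch parser = new LenientParserSketch();
        List<String> repairs = new ArrayList<>();
        parser.addRepairListener(
                (location, description) -> repairs.add(location + ": " + description));
        parser.parseLength("12");   // valid, no event
        parser.parseLength("1x2");  // malformed, one repair event
        System.out.println(repairs);
    }
}
```

Every place that silently corrects something would need a call like fireRepaired, which is why this can only lay the ground in a first step.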

With regards to versioning I'd like to propose that we have 2.0 as LTS
and 4.x being the next LTS.

Thoughts
BR
Maruan


