Re: Maven 4 performances problems

2021-07-05 Thread Guillaume Nodet
I've raised a PR for review
   https://github.com/apache/maven/pull/486/
This more or less rewrites the build/consumer transformation to reuse the XML
pull parser and perform the transformation on the fly, instead of reading and
transforming with a SAX API and writing to a stream so that the pull parser
can read it again.
At first glance, it brings the performance of the build/consumer feature
much closer to the performance with the feature disabled:

   Maven 4 with build/consumer:  23.12s
   Maven 4 w/out build/consumer: 22.43s
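
To put the averages quoted in this thread side by side, here is a minimal sketch of the relative overhead of the build/consumer feature before and after the change. The class and method names are hypothetical and only the timings come from the thread. It works out to roughly 21% before and roughly 3% after.

```java
import java.util.Locale;

public class OverheadSketch {

    /** Relative slowdown of `measured` over `baseline`, in percent. */
    static double overheadPercent(double baseline, double measured) {
        return (measured - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Average timings in seconds, as reported in this thread.
        double before = overheadPercent(23.43, 28.40); // original implementation
        double after = overheadPercent(22.43, 23.12);  // with the proposed change
        System.out.println(String.format(Locale.ROOT, "before: %.1f%%", before));
        System.out.println(String.format(Locale.ROOT, "after:  %.1f%%", after));
    }
}
```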

I'll investigate the failure on Windows tomorrow, but I suspect a line
ending problem...  I'll also raise a JIRA issue to track this improvement.
I've also done some measurements with the sync context factory option, but
that did not seem to change much for this use case.

Guillaume

On Thu, Jul 1, 2021 at 11:19, Guillaume Nodet wrote:

> I've been running a few tests to measure performance.
> The test is simplistic: it runs the following command in a loop and
> measures the execution time.  It is run on a fairly big project so
> that a bunch of pom files are actually read.
>
> for i in 1 2 3 4 5 6 7 8 9 10 ; do time "$MAVEN_HOME"/bin/mvn -DskipTests \
>   -Dmaven.experimental.buildconsumer=true help:evaluate \
>   -Dexpression=java.io.tmpdir -DforceStdout -q ; done
>
> The average results are the following:
>    Maven 4 with build/consumer:  28.40s
>    Maven 4 w/out build/consumer: 23.43s
>    Maven 3:                      21.54s
>
> I find the 20% performance loss of the build/consumer feature quite
> problematic.  I hinted at those possible performance problems when
> reviewing the original PR, so I'd like to investigate a
> different way of achieving the transformation.  I think the main
> performance cost comes from using the following pattern:
>   read file -> parse using JAXP -> transform using TRAX -> write to stream
>   read stream -> parse using XPP3
> The first step is performed in a separate thread and its output is written
> to a piped stream, which is used as the input of the usual pom parser.  This
> double parsing step, in addition to using the JAXP / TRAX API, which is not
> the fastest one, comes at a heavy cost.
>
> I see two ways to solve the problem:
>   * refactor the build/consumer feature to use a different API so that the
> parsing can be done in a single step (this would mean defining an XmlFilter
> interface to do the filtering and wrapping it inside an XmlPullParser)
>   * get rid of the Xpp3 implementation and use the more common StAX API,
> which already defines filters
>
> The second option has some drawbacks though: all the plugin configuration
> done using Xpp3Dom would not work anymore, so this is a very big and
> incompatible change.
>
> I'm thus willing to investigate the first option and see what can be
> done.  If there's a consensus, I'll start working on a POC of the API /
> filters and will get back to this list with some more information.
>
> --
> 
> Guillaume Nodet
>
>
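
The first option quoted above (a filter wrapped around the pull parser so that parsing happens in a single step) can be sketched as follows. This is only an illustration under stated assumptions, not the code from the PR: the `PullParser` interface, the `SkippingParser` wrapper, the string-based events and the choice of `<modules>` as the dropped element are all hypothetical stand-ins for the real XmlPullParser-based types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PullFilterSketch {

    /** Hypothetical stand-in for a pull parser: next event, or null at end of input. */
    interface PullParser {
        String next();
    }

    /** Drops one element and its content while events are pulled, with no intermediate stream. */
    static final class SkippingParser implements PullParser {
        private final PullParser delegate;
        private final String startTag;
        private final String endTag;

        SkippingParser(PullParser delegate, String startTag, String endTag) {
            this.delegate = delegate;
            this.startTag = startTag;
            this.endTag = endTag;
        }

        @Override
        public String next() {
            String event = delegate.next();
            while (startTag.equals(event)) {
                // Swallow everything up to and including the end tag
                // (assumes the skipped element is not nested in itself).
                do {
                    event = delegate.next();
                } while (event != null && !endTag.equals(event));
                if (event == null) {
                    return null; // input ended inside the skipped element
                }
                event = delegate.next();
            }
            return event;
        }
    }

    /** Toy event source backed by a list, standing in for the real parser. */
    static PullParser fromList(List<String> events) {
        Iterator<String> it = events.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Pulls every remaining event, as a consumer such as a model reader would. */
    static List<String> drain(PullParser parser) {
        List<String> out = new ArrayList<>();
        for (String e = parser.next(); e != null; e = parser.next()) {
            out.add(e);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "<project>", "<modules>", "<module>", "a", "</module>", "</modules>",
                "<version>", "1.0", "</version>", "</project>");
        PullParser filtered = new SkippingParser(fromList(raw), "<modules>", "</modules>");
        System.out.println(drain(filtered));
        // prints [<project>, <version>, 1.0, </version>, </project>]
    }
}
```

The point of the design is that the consumer only ever sees the wrapper, so the input is parsed once and the filtering costs one extra method call per event instead of a second parse through a piped stream.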

-- 

Guillaume Nodet


Re: Maven 4 performances problems

2021-07-01 Thread Romain Manni-Bucau
+1 to fix it before any release after alpha, but I also agree it is fine to let
alpha-1 go out as long as the issue is explicitly mentioned as known and being
worked on.

Romain Manni-Bucau





Re: Maven 4 performances problems

2021-07-01 Thread Michael Osipov

On 2021-07-01 at 11:19, Guillaume Nodet wrote:

> I've been running a few tests to measure performance.
> The test is simplistic: it runs the following command in a loop and
> measures the execution time.  It is run on a fairly big project so that a
> bunch of pom files are actually read.
>
> for i in 1 2 3 4 5 6 7 8 9 10 ; do time "$MAVEN_HOME"/bin/mvn -DskipTests \
>   -Dmaven.experimental.buildconsumer=true help:evaluate \
>   -Dexpression=java.io.tmpdir -DforceStdout -q ; done
>
> The average results are the following:
> Maven 4 with build/consumer:  28.40s
> Maven 4 w/out build/consumer: 23.43s
> Maven 3:  21.54s


Kindly try also with -Daether.syncContext.named.factory=nolock

-
To unsubscribe, e-mail: dev-unsubscr...@maven.apache.org
For additional commands, e-mail: dev-h...@maven.apache.org



Re: Maven 4 performances problems

2021-07-01 Thread Guillaume Nodet
No, I don't think this requires a delay in the alpha-1 release.

On Thu, Jul 1, 2021 at 12:27, Robert Scholte wrote:

> Should we postpone the alpha-1 release because of this?
> For me the most important reason for alpha-1 is to get confirmation that
> builds won't be broken due to build/consumer.
> But if users simply look at buildtime and due to some slower result don't
> care for the other changes, then we shouldn't do this release now.
>
> Robert
>
>
>


-- 

Guillaume Nodet


Re: Maven 4 performances problems

2021-07-01 Thread Anders Hammar
I agree. We could even mention this known performance "issue" so there
wouldn't be any surprise.

/Anders

On Thu, Jul 1, 2021 at 12:29 PM Enrico Olivelli  wrote:

> On Thu, Jul 1, 2021 at 12:27, Robert Scholte <rfscho...@apache.org> wrote:
>
> > Should we postpone the alpha-1 release because of this?
> > For me the most important reason for alpha-1 is to get confirmation that
> > builds won't be broken due to build/consumer.
> >
>
> 100% agreed
> it is an ALPHA and there are many cool features; it is worth giving it to
> the users to get feedback
>
> my two cents
> Enrico
>
>
> > But if users simply look at buildtime and due to some slower result don't
> > care for the other changes, then we shouldn't do this release now.
>
>
> > Robert
> >
> >
> >
>


Re: Maven 4 performances problems

2021-07-01 Thread Enrico Olivelli
On Thu, Jul 1, 2021 at 12:27, Robert Scholte wrote:

> Should we postpone the alpha-1 release because of this?
> For me the most important reason for alpha-1 is to get confirmation that
> builds won't be broken due to build/consumer.
>

100% agreed
it is an ALPHA and there are many cool features; it is worth giving it to
the users to get feedback

my two cents
Enrico


> But if users simply look at buildtime and due to some slower result don't
> care for the other changes, then we shouldn't do this release now.


> Robert
>
>
>


Re: Maven 4 performances problems

2021-07-01 Thread Robert Scholte
Should we postpone the alpha-1 release because of this?
For me the most important reason for alpha-1 is to get confirmation that builds 
won't be broken due to build/consumer.
But if users simply look at build time and, because of a slower result, don't care
about the other changes, then we shouldn't do this release now.

Robert


