On 05/02/2007, at 12:15 PM, Jason van Zyl wrote:
It's bleeding memory because there are thousands of POMs in the
repository, and the hashmap has no size limit. A cache would allow
entries to be expired, size-limited, etc., so the memory consumption
is bounded.
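
To illustrate the difference between an unbounded hashmap and a size-limited cache, here is a minimal sketch using the JDK's LinkedHashMap in access order. The class name BoundedPomCache and the string keys and values are hypothetical stand-ins; this is not the project builder's actual code, just one way a fixed upper bound on entries could be enforced.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-limited LRU cache; not Maven's actual implementation. */
class BoundedPomCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedPomCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every put; returning true evicts the least recently used entry
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedPomCache<String, String> cache = new BoundedPomCache<>(2);
        cache.put("g:a:1.0", "model-a");
        cache.put("g:b:1.0", "model-b");
        cache.get("g:a:1.0");            // touch a, so b becomes the eldest
        cache.put("g:c:1.0", "model-c"); // exceeds the limit, b is evicted
        System.out.println(cache.keySet()); // prints [g:a:1.0, g:c:1.0]
    }
}
```

With a plain HashMap every POM ever read stays referenced; here the entry count never exceeds maxEntries, at the cost of re-reading evicted POMs.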
That's just not the function of the builder. That's just wrong as
it's just not the job of the builder to be storing them like that.
I've already agreed with that multiple times. I'm trying to explain
why it's a problem now.
Not really understanding here. Are you saying no cache is
necessary, or it can be done elsewhere?
In the builder no cache should be present. It was jacked in there
to make reactor builds work a little quicker.
That doesn't answer my question. No cache in Maven at all, or
somewhere else? (Neither option being in the project builder).
Dependency POMs are built many times across the life of a Maven
build, so the cache is certainly improving performance compared to
reading them back from the file system in the local repository.
I don't think anyone has ever measured how long it actually takes
to build a MavenProject. I don't think we want to put all that
logic in there anyway, because if you start using it for things like
Archiva then you start having to check timestamps on disk or other
source changes. I think the caching should just be removed from the
builder.
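
As a rough sketch of what "checking timestamps on disk" could involve for a long-running tool, the following keeps the file's last-modified time next to each cached value and reloads when it changes. The class TimestampCheckedCache and its loader function are hypothetical; nothing here is taken from Maven or Archiva.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache that reloads a file when its last-modified time changes. */
class TimestampCheckedCache<V> {
    private static final class Entry<V> {
        final long lastModified;
        final V value;

        Entry(long lastModified, V value) {
            this.lastModified = lastModified;
            this.value = value;
        }
    }

    private final Map<File, Entry<V>> entries = new HashMap<>();
    private final Function<File, V> loader;

    TimestampCheckedCache(Function<File, V> loader) {
        this.loader = loader;
    }

    V get(File file) {
        long stamp = file.lastModified(); // 0L if the file does not exist
        Entry<V> cached = entries.get(file);
        if (cached != null && cached.lastModified == stamp) {
            return cached.value; // file unchanged since it was loaded
        }
        V value = loader.apply(file); // first read, or stale entry: load again
        entries.put(file, new Entry<>(stamp, value));
        return value;
    }
}
```

Every lookup still costs a stat call on the file system, which is part of why this kind of logic arguably does not belong in the builder itself.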
I'm not talking about Archiva here, and I'm not talking about keeping
a cache in the project builder.
We can measure it to find out later, but I'm almost certain some
level of caching will be needed somewhere (probably the resolver, or
a new component) because of the number of times the POMs are read in
Maven itself. It may not be a flat POM cache, it might be a different
way the resolver uses and stores the data it gets back.
Anyway, as the builder is used today, I am sure that just removing
the caching will degrade performance as it will read back from the
disk a number of times. Artifact and project builder changes may
change that, but I'm trying to take this one step at a time.
In something like Archiva the model should be placed in a store,
not cached so I don't think it really helps there a whole lot.
That's a separate discussion, but the repository is the store.
That's one store. I'm actually sticking them in an object database
and who knows what other third party tool folks will do.
That's fine, but it is a store that needs to read POMs.
Only a subset of the POM is stored in the index, and there is no
point duplicating the POM into some other store.
If you have read a released POM and stored it in a database, why
would you ever need to build it again? Barring silly people
re-releasing the same things, for a release consisting entirely of
releases that closure could be stored and never really need to be
read from a pom.xml file again.
I doubt duplicating the data in a database will make it any faster
than using the project builder. And in both cases, a cache will be
needed. But anyway, not relevant.
So if we put that in there and then two weeks from now I pull it
out and remove the caching you guys aren't going to complain?
No, because there's no way I'd use 2.1-SNAPSHOT in Archiva. Once it
gets released, we can migrate to any new APIs, but I'm pretty sick of
chasing moving API targets.
If Joakim wanted to take a stab at that I would work with him for a
day to try and fix the builder properly. I'm sure in a day between
the two of us we could chop the project builder apart and fix it
properly. The cache in there is a band aid.
That's entirely up to him, but I certainly won't advocate using
trunk in Archiva yet.
- Brett