Hi Niclas, thanks for the response.
Niclas Hedhman wrote:
2011/12/12 Falko Bräutigam <[email protected]>:
It takes ~30s to load 10,000 entities (consisting of ~100k ValueComposites
of different types). After loading, all this eats up ~100MB of RAM. This is
too slow and too much.
1. Which store are you using? Neo4j *may* use less RAM than for
instance JDBM. Loading speed is highly EntityStore dependent, and
whether you are using indexing to retrieve those 10,000 entities. The
"raw" speed of Qi4j is probably ~10k entities per second, slowing down
as serialization and I/O become factors.
I'm using Lucene as the EntityStore. Each entity is a document. I will test the
raw performance of the store and send numbers.
No indexing is involved here. The query loads the entire database.
I'm using Qi4j 1.0. (Yes, I know it's old, but I don't have the time to port my
patches and test everything together for every Qi4j release.) Anyhow, do you
think load speed differs that much between v1.0 and 1.4?
2. Not sure what you are saying. 100,000 values in total? So on
average, each value taking 1kB is "too much"? Well, what is the
composition of those values? Otherwise it is hard to analyze.
100,000 values in total. Each entity consists of ~10 values. Each entity takes
10kB and each value takes 1kB on average.
But I think that the general problem is deep inside Qi4j's data
management, which relies on 4 HashMaps that probably occupy a few kB
each, quickly eating away at the memory consumption. Perhaps some
"hinting" system could improve the consumption size. Perhaps even more
clever lookup mechanisms, especially when the number of properties per
entity is low.
10K entities are not much for a GIS application. Given the current memory
footprint (and 1GB Java heap) not even 10 users can work with the
application concurrently.
Where does the 1GB heap come from in a multi-user application? It is
hard to discuss in the abstract when your argument is based on
concrete examples.
Sorry, I just wanted to give an example. 1GB is just the "usual" heap size of
our deployments. This is fairly big for our customers but no problem. A problem
*would* be if the application required 10GB+ of heap.
Usually a GIS application works in a pipelined mode when rendering features
(entities). Memory is never a problem with that architecture. Unfortunately
Qi4j holds all entities of a UoW in memory. We discussed this earlier on
this list. So I added a cache SPI to UnitOfWorkInstance. This works but it
does not actually cure the problem of memory consumption because of the time
needed to re-instantiate the entities.
Interestingly enough, I was sketching on another architecture today,
where I observed the "read-only" case being very separated (well, it
is CQRS related) and UnitOfWork isolation not being an issue. If 2.x
would move towards a "read-only"-mode of UoWs, could that solve
"streaming", i.e. as soon as you are done "rendering" you simply drop
the Entity and it is not held in memory?
Exactly. As long as the Entity is not modified, it is eligible for GC at any
time. I'm not quite sure if the UoW API needs to be changed (or maybe I don't
get the idea of a "read-only" UoW). Using a copy-on-write cache to handle
instances internally, the UoW API does not need to be changed and all the memory
(and loading) problems should go away. Am I missing something here?
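A minimal sketch of the copy-on-write idea in plain Java. None of this is Qi4j API: `UowCacheSketch`, `Entity`, `loadFromStore` and `setName` are hypothetical names used only for illustration. Unmodified instances are held through weak references, so the GC may reclaim them at any time; the first write takes a private copy that is strongly held until the UoW completes.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not Qi4j code: a UoW-internal copy-on-write entity cache. */
public class UowCacheSketch {

    /** Stand-in for an entity instance (not a Qi4j type). */
    public static final class Entity {
        public final String id;
        public String name;
        Entity(String id, String name) { this.id = id; this.name = name; }
    }

    // Unmodified entities: weakly referenced, reclaimable by the GC at any time.
    private final Map<String, WeakReference<Entity>> clean = new HashMap<>();
    // Modified entities: strongly referenced until the UoW completes.
    private final Map<String, Entity> dirty = new HashMap<>();
    private int loads = 0;

    /** Pretends to read the entity state from the underlying store. */
    private Entity loadFromStore(String id) {
        loads++;
        return new Entity(id, "name-" + id);
    }

    /** Returns the modified copy if one exists, else a (possibly reloaded) clean instance. */
    public Entity get(String id) {
        Entity e = dirty.get(id);
        if (e != null) return e;
        WeakReference<Entity> ref = clean.get(id);
        e = (ref != null) ? ref.get() : null;
        if (e == null) {
            // Never loaded, or already collected: load again and cache weakly.
            e = loadFromStore(id);
            clean.put(id, new WeakReference<>(e));
        }
        return e;
    }

    /** The first write copies the clean instance and pins the copy. */
    public void setName(String id, String name) {
        Entity e = dirty.get(id);
        if (e == null) {
            Entity original = get(id);
            e = new Entity(original.id, original.name);
            dirty.put(id, e);
        }
        e.name = name;
    }

    public int dirtyCount() { return dirty.size(); }
    public int loadCount() { return loads; }
}
```

With this shape, memory held by a UoW grows with the number of modified entities only. The cost of reloading a collected instance remains, which is the re-instantiation time mentioned earlier in this thread.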
If so (to help you in 1.4), does that mean that your "render task"
could be broken up into a series of smaller chunks rendered one at a
time, or are there other constraints preventing this?
This is exactly how rendering is done (for other, non-Qi4j data stores).
Features are fetched in chunks from the underlying store. They pass through a
pipeline of chained processors; the renderer is the last processor. As soon as
a feature is rendered it is eligible for GC.
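The chunked pipeline can be sketched roughly like this in plain Java. All names (`ChunkedPipelineSketch`, `Processor`, `fetchChunk`, `renderAll`) are hypothetical, features are reduced to strings, and "rendering" just appends to a `StringBuilder`; this is an illustration of the pattern, not our actual pipeline code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of chunked fetching with a chain of processors. */
public class ChunkedPipelineSketch {

    /** One stage of the pipeline; the renderer would be the last stage. */
    public interface Processor {
        String process(String feature);
    }

    /** Pretends to fetch features [offset, offset + limit) from a store holding `total` features. */
    static List<String> fetchChunk(int offset, int limit, int total) {
        List<String> chunk = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + limit, total); i++) {
            chunk.add("feature-" + i);
        }
        return chunk;
    }

    /** Pushes every feature through all processors, one chunk at a time. */
    public static int renderAll(int total, int chunkSize, StringBuilder out, Processor... processors) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        int rendered = 0;
        for (int offset = 0; offset < total; offset += chunkSize) {
            List<String> chunk = fetchChunk(offset, chunkSize, total);
            for (String feature : chunk) {
                String f = feature;
                for (Processor p : processors) {
                    f = p.process(f);
                }
                out.append(f).append('\n'); // stand-in for the actual rendering
                rendered++;
            }
            // No reference to the chunk survives this iteration, so at most
            // one chunk of features is strongly reachable at any time.
        }
        return rendered;
    }
}
```

Peak memory here depends on `chunkSize`, not on the total number of features, which is why memory is not a problem with this architecture.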
The problem is that for this use case I don't need the domain-specific layer
(on top of the raw data states) at all. Or at least I don't need the
Entity/Mixin/Concern stuff in each and every case. It depends on the processors
in the pipeline. The rendering itself is not domain specific. But a Concern
might be changing a Property, which *could* influence rendering. So I
was thinking that detaching the modelling layer of Qi4j from the raw data could
be great. Then one could re-use the same Composite instance to access several
(many) entity states. Sort of a flyweight.
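A rough sketch of the flyweight idea in plain Java. `FlyweightSketch`, `FeatureView`, `bind` and the `width`/`height` properties are hypothetical and not part of Qi4j; the raw entity state is modelled as a simple `Map`. A single view instance carries the domain logic and is re-bound to one raw state after another, so no per-entity composite is instantiated.

```java
import java.util.Map;

/** Hypothetical sketch: one domain-level view re-used across many raw entity states. */
public class FlyweightSketch {

    /** Stand-in for a composite: holds behaviour, but no entity data of its own. */
    public static final class FeatureView {
        private Map<String, Object> state; // raw state of the currently bound entity

        /** Points this view at another raw state and returns the same instance. */
        public FeatureView bind(Map<String, Object> state) {
            this.state = state;
            return this;
        }

        public String name() {
            return (String) state.get("name");
        }

        /** Example of domain logic computed from raw properties. */
        public double area() {
            double w = ((Number) state.get("width")).doubleValue();
            double h = ((Number) state.get("height")).doubleValue();
            return w * h;
        }
    }

    /** Walks many raw states with a single view instance. */
    public static double totalArea(Iterable<Map<String, Object>> states) {
        FeatureView view = new FeatureView();
        double sum = 0;
        for (Map<String, Object> s : states) {
            sum += view.bind(s).area();
        }
        return sum;
    }
}
```

The obvious limitation is that a bound view must not be kept beyond the current iteration step, since the next `bind` call changes what it points to.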
-Falko
--
Polymap GmbH
Industriestr. 85-95, 04229 Leipzig
Geschäftsführer: Falko Bräutigam
HRB 23133 (Amtsgericht Leipzig)
UST-IdNr.: gemäß § 27 a Umsatzsteuergesetz: DE253001307
Kammerzugehörigkeit: IHK zu Leipzig, IHK zu Rostock