Raphael Luta wrote:
> > [EMAIL PROTECTED] wrote:
> > Whether it is better to store the PSML reference in a session or in
> > some other data structure depends on the portal usage pattern. We
> > envision a usage pattern where a large number of users accesses the
> > portal mostly between 0 and 5 times a day; each user can have a
> > personalized page or gets the default page. We expect a large
> > percentage of users to just use the default page.
> >
> > Storing the reference to the PSML datastructure in the session is the
> > most adequate solution for this scenario, as the PSML file is only
> > parsed once per session and can be garbage collected immediately after
> > the session expires. As each user has a personal page, the PSML cannot
> > be shared between users in our scenario.
> >
>
> If I understand your scenario correctly:
> - you'll never really use the cache for the personalized pages because
> your users will probably expire their sessions between requests (as
> there are few request/day expected)
Not quite. Users may e.g. access the portal once per day, view their
personal portal page, click on a link, go back to their page, click on
another link, etc. In this case the data in the session or in the cache
would be used.
> - you'll heavily benefit from having the default PSML always in cache.
>
> In what respect would an MRU cache *not* fit your needs?
Would you use an MRU cache implementation similar to the one described at
http://developer.java.sun.com/developer/onlineTraining/Programming/JDCBook/perf4.html ?
I'm not sure how an MRU cache would perform under high load, with
tens of thousands of concurrent users on an app server with a big
thread pool running on a multi-processor machine. There might be
synchronization issues resulting in temporarily blocked threads.
The MRU cache probably needs to synchronize the methods that get
elements from and put elements into the cache, and for each cache
hit it has to find the element in the list, remove it and add it
again at the head of the list. You may also need to synchronize on
the cache while you are reading and parsing the PSML to get the
object structure to put in the cache. All threads handling
concurrent requests that want to get something from the cache
during that time would be blocked until the cache has been updated.
As it takes some time to read a PSML file from disk and create the
object structure that represents it, the threads that cause cache
misses may block the threads that would cause cache hits for a
significant amount of time.
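To make the locking concern concrete, here is a minimal Java sketch.
It assumes a hypothetical MruCache class (my own name, not existing
Jetspeed code) built on java.util.LinkedHashMap in access order. Every
hit still takes the cache lock to move the entry to the most recently
used position, but the slow load (reading and parsing the PSML) runs
outside the lock, so a miss does not block hits from other threads:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch for illustration only, not part of Jetspeed.
class MruCache<K, V> {
    private final Map<K, V> entries;

    MruCache(final int capacity) {
        // accessOrder = true: each get() moves the entry to the
        // most recently used end of the internal list.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Drop the least recently used entry once we exceed capacity.
                return size() > capacity;
            }
        };
    }

    V get(K key, Function<K, V> loader) {
        // Short critical section: lookup and reordering only.
        synchronized (entries) {
            V cached = entries.get(key);
            if (cached != null) {
                return cached;
            }
        }
        // Cache miss: do the expensive load without holding the lock.
        V loaded = loader.apply(key);
        synchronized (entries) {
            // Another thread may have loaded the same key in the meantime.
            V raced = entries.get(key);
            if (raced != null) {
                return raced;
            }
            entries.put(key, loaded);
            return loaded;
        }
    }

    int entryCount() {
        synchronized (entries) {
            return entries.size();
        }
    }
}
```

The price of loading outside the lock is that two threads missing on
the same key at the same time may both parse the file. Even so, every
request still contends for one lock, which the per-session approach
avoids.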
If we store the object structure once per user session,
all threads can still run in parallel without synchronization.
Application servers can manage sessions themselves; using
self-programmed caches takes that control away from them.
My feeling is that if we run a portal in an environment where
performance really counts (let's say on a machine with 8+ processors
and some GB of RAM), we may be better off just accepting a certain
memory footprint per session, calculating the amount of RAM needed and
plugging it into the machine. I don't know how big the object tree
representing the PSML can get, but if we assume 40 KB and 10000
concurrent sessions, this would mean a memory usage of only 400 MB.
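The estimate is simple enough to redo for other numbers. A tiny Java
sketch, where SessionMemoryEstimate is a made-up helper and the 40 KB
per tree is only my assumption from above, not a measured value:

```java
// Hypothetical helper for illustration only.
class SessionMemoryEstimate {
    /** Total bytes needed if every concurrent session holds one object tree. */
    static long totalBytes(long bytesPerTree, long concurrentSessions) {
        return bytesPerTree * concurrentSessions;
    }
}
```

SessionMemoryEstimate.totalBytes(40000L, 10000L) gives 400,000,000
bytes, i.e. 400 MB. Counting 1 KB as 1024 bytes it is 409,600,000
bytes, still roughly 400 MB.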
The other question I have is how we would handle changes to the
user's PSML. Assume we have a page customizer and a user changes
his page layout. Would the page customizer remove or update the
PSML in the cache? Or would the cache check whether the file has
changed each time it is accessed?
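Both options could look roughly like the Java sketch below. All names
are hypothetical (ReloadingPsmlCache is not Jetspeed code), and the
timestamp source is passed in so it can be replaced; in a real setup
it would probably be File::lastModified:

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.ToLongFunction;

// Hypothetical sketch for illustration only, not part of Jetspeed.
class ReloadingPsmlCache<V> {
    private static final class Entry<V> {
        final long lastModified;
        final V tree;

        Entry(long lastModified, V tree) {
            this.lastModified = lastModified;
            this.tree = tree;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final ToLongFunction<File> timestamps; // assumed: File::lastModified
    private final Function<File, V> parser;        // assumed: PSML file -> object tree

    ReloadingPsmlCache(ToLongFunction<File> timestamps, Function<File, V> parser) {
        this.timestamps = timestamps;
        this.parser = parser;
    }

    // Option 2: the cache compares the file timestamp on every access
    // and re-parses when it differs from the cached one.
    V get(File psmlFile) {
        long modified = timestamps.applyAsLong(psmlFile);
        Entry<V> entry = entries.get(psmlFile.getPath());
        if (entry == null || entry.lastModified != modified) {
            entry = new Entry<>(modified, parser.apply(psmlFile));
            entries.put(psmlFile.getPath(), entry);
        }
        return entry.tree;
    }

    // Option 1: the page customizer removes the entry after saving,
    // so the next access parses the new file.
    void invalidate(File psmlFile) {
        entries.remove(psmlFile.getPath());
    }
}
```

The timestamp check costs one file system call per access, while the
explicit removal only works if every writer of the PSML file knows
about the cache.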
> The only disadvantage I see to the MRU is that it will use a
> little more memory under small load (because the pages will
> be persisted in the cache and not released).
> Once the MRU is full or nearly full, the behavior and cost
> associated to the MRU should be about the same that the cost
> associated to the session cache. Am I missing something here ?
Behavior under small load does not worry me.
> > I understand that there are other cases, where PSML files can
> > be shared between users. Is the time consumed for parsing the
> > PSML and generating the object tree that represents it or
> > memory usage per session the problem that you see when using
> > the session approach ?
> >
>
> My main concern about using a session cache is that we're making
> a usage pattern assumption which may not be true in some
> installations. I'd like the portal engine to be agnostic to usage
> pattern.
The MRU cache relies on the assumption that it is advantageous
to cache the n most recently used elements. If, for example, you
have 10000 users with custom pages who request their home page
every ten minutes (home page, read article, back to home page,
read article, back to home page, read article, ...), you get a
mean of 1000 requests for the home page per minute. If you have an
MRU cache with a size of 11000, you're fine. If the MRU cache can
store only 9000 elements, it may happen that you get nothing but
cache misses, because a user's PSML is always discarded just
before the user would have accessed his home page again.
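This worst case is easy to reproduce with a small simulation. The
Java sketch below is only an illustration under simplifying
assumptions: the users request their home page in strict round-robin
order, and the cache is a java.util.LinkedHashMap in access order.
MruThrashDemo and countHits are made-up names:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical simulation for illustration only.
class MruThrashDemo {
    /** Counts cache hits when all users request their page in turn, round after round. */
    static long countHits(final int capacity, int users, int rounds) {
        Map<Integer, Boolean> cache =
            new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                    // Discard the least recently used entry when the cache is full.
                    return size() > capacity;
                }
            };
        long hits = 0;
        for (int round = 0; round < rounds; round++) {
            for (int user = 0; user < users; user++) {
                if (cache.get(user) != null) {
                    hits++;                        // PSML tree was still cached
                } else {
                    cache.put(user, Boolean.TRUE); // "parse" and cache it
                }
            }
        }
        return hits;
    }
}
```

With 10000 users and 5 rounds, a capacity of 9000 yields 0 hits,
because each entry is evicted shortly before its user comes back. A
capacity of 11000 yields 40000 hits, i.e. everything after the first
round. Real traffic is less regular, so the effect would be less
extreme.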
> Optimization for a given pattern should be handled at a
> pluggable component level.
I certainly agree with that. We might have an MRUCache and a
"SessionCache". We could let them implement the same interface
to make them interchangeable. I guess we'd have to pass RunData
to allow a cache implementation to put data in the session
if required. A property may be used to determine which strategy
to use.
It would be ok to start with an MRU cache if it is possible to
replace it with a "cache" that uses the session to store data
when required.
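A rough Java sketch of what I have in mind. All names are
hypothetical, including the property key "psml.cache.strategy". A
plain Map of session attributes stands in for RunData, and the shared
implementation leaves out eviction to keep the sketch short:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical interface; the session map stands in for RunData.
interface PsmlCache {
    Object get(String psmlKey, Map<String, Object> session,
               Function<String, Object> loader);
}

// Keeps the object tree in the user's session.
class SessionPsmlCache implements PsmlCache {
    public Object get(String psmlKey, Map<String, Object> session,
                      Function<String, Object> loader) {
        return session.computeIfAbsent("psml:" + psmlKey,
                                       k -> loader.apply(psmlKey));
    }
}

// Keeps the object tree in one shared map (placeholder for the MRU cache).
class SharedPsmlCache implements PsmlCache {
    private final Map<String, Object> shared = new HashMap<>();

    public synchronized Object get(String psmlKey, Map<String, Object> session,
                                   Function<String, Object> loader) {
        return shared.computeIfAbsent(psmlKey, loader);
    }
}

class PsmlCaches {
    // Picks the strategy from a property; the key and values are made up.
    static PsmlCache fromProperties(Properties props) {
        String strategy = props.getProperty("psml.cache.strategy", "mru");
        return "session".equals(strategy)
            ? new SessionPsmlCache()
            : new SharedPsmlCache();
    }
}
```

Callers would only see PsmlCache, so switching strategies would be a
configuration change rather than a code change.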
> The Profiler component is currently responsible for implementing
> the usage pattern, maybe we can add methods to the Profiler API
> to allow a profiler to provide caching hints for a cache system ?
A hint might be "per-user", "per-group" or "global", perhaps. For
per-user data, the "SessionCache" might be used, for per-group or
global data the MRU cache. That would add some additional complexity,
though.
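As a sketch of how such hints might map to a strategy (CacheHint and
CacheSelector are invented names, and the returned strings are just
labels, not anything in the current Profiler API):

```java
// Hypothetical hint a profiler could return for a PSML resource.
enum CacheHint { PER_USER, PER_GROUP, GLOBAL }

class CacheSelector {
    // Per-user data goes to the session, shared data to the MRU cache.
    static String strategyFor(CacheHint hint) {
        switch (hint) {
            case PER_USER:
                return "session";
            case PER_GROUP:
            case GLOBAL:
            default:
                return "mru";
        }
    }
}
```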
It seems the UserProfiler also does some caching. Would this mean
we cache the PSML file in the DiskCache and the object tree that
is generated from it in the MRU cache ?
Best regards,
Thomas
Thomas Schaeck
IBM Pervasive Computing Division
Phone: +49-(0)7031-16-3479 e-mail: [EMAIL PROTECTED]
Address: IBM Deutschland Entwicklung GmbH,
Schoenaicher Str. 220, 71032 Boeblingen, Germany