Re: PerSessionPageStore thread-safety

2020-02-27 Thread Thomas Heigl
Hi Martin,

Could you please explain how exactly Spring Session works ? This will help
> us both understand where the problem comes from.
>

Sure.  From the docs
:

Spring Session provides an API and implementations for managing a user’s
> session information while also making it trivial to support clustered
> sessions without being tied to an application container-specific solution.
> It allows replacing the HttpSession in an application container-neutral
> way, with support for providing session IDs in headers to work with RESTful
> APIs.


Spring Session basically is a servlet filter

that looks up and persists sessions in an external store.
It does so, by wrapping the servlet request and providing a custom
implementation of HttpSession that keeps
track of changes and persists them at the end of the request. It is similar
to what container-based replication
does but is *independent* of the container. That's why I prefer to use it
instead of adding a custom session manager to Tomcat.
It is much easier to customize Spring Session behavior (e.g. how cookies
are stored, etc) because it is part
of your application and not your container.

Also how exactly do you integrate it with Wicket ? Which extension
> points do you use ?


There are *no* integration points with Wicket. Replacing the HttpSession
with Spring Session is completely transparent.
Wicket does not know anything about Spring session. The only difference is
that the session is not held in the containers memory,
but serialized to and from an external source. I have done extensive
testing and except for access synchronization everything
else works as expected.

Since it is backed by Redis it means that it is distributed.
>
But at the same time you try to use PerSessionPageStore which is in-memory
> store, i.e. it lives only in the memory of one of the web container nodes.
> Another node has its own instance of this class.


I want to use PerSessionPageStore as a local cache on each node together
with sticky sessions to avoid
having to go through the whole process of Redis + Deserialization for every
single ajax request.
In a non-clustered environment or an environment where clustering is done
by the servlet container, Wicket
can make use of the session cache that stores the last touched pages in the
session. In case of
Spring Session, the session cache cannot be used. Sessions are
(de)serialized on every request and the session
cache in Wicket is transient.

So strictly speaking, I don't need the PerSessionPageStore but since I
cannot use the session cache, ajax requests
become quite expensive. I'm using it as an application-scoped alternative
to the session cache.

Here is how it works in normal Servler environment:
> .


I completely agree on all your descriptions of session management in a
(clustered) container environment. We have
a shared understanding on container-based session management.

I don't see how application-scope would help at all. As I explained
> earlier PageAccessSynchronizer is per o.a.w.Session instance.
> Maing it application-scoped will only add more concurrency issues and you
> will have to add locks on session level. At the moment it is lock-free.


There will be more concurrency, but that is not really an issue:

The quick replacement for the PageAccessSynchronizer I wrote this morning
uses the following data structure:

private static final ConcurrentMap LOCKS = new
> ConcurrentHashMap<>();


I made it static instead of moving it to the application, because it was
faster to implement. The only difference to
the default synchronizer is the key used for the map. Instead of an integer
for the pageId, I'm using a composite
key of sessionId and pageId.

@Value
> static final class PageKey {
> String sessionId;
> Integer pageId;
> }


All locking/synchronization is done on the PageLock object that is scoped
to a single session by the composite key.
There will be more reads/writes to the concurrent map, but no additional
locking is necessary and unless there are
thousands of requests per second this should hardly be an issue.

Let us first hear back how this will solve the issue and then we will
> discuss such change.


To summarize:

I'm using Spring Session that is a simple, container-neutral, transparent
solution for clustering sessions. Wicket
works fine with Spring Session. Except for one minor issue:
PageAccessSynchronizer does not work because
sessions are not held in memory but (de)serialized on every request. Thus
the internal lock map is not shared
between requests. Moving the lock map into global or application scope
solves the issue with the downside of
increased concurrency to the, now shared, map but should not require any
additional locking or synchronization.
It is quite trivial to write such a global synchronizer, but it could be
much easier if some 

Re: PerSessionPageStore thread-safety

2020-02-27 Thread Martin Grigorov
Hi Thomas,

Could you please explain how exactly Spring Session works ? This will help
us both understand where the problem comes from.
Also how exactly do you integrate it with Wicket ? Which extension points
do you use ?

Since it is backed by Redis it means that it is distributed.
But at the same time you try to use PerSessionPageStore which is in-memory
store, i.e. it lives only in the memory of one of the web container nodes.
Another node has its own instance of this class.

Here is how it works in normal Servler environment:

The HttpSession instances are managed by the web container, e.g. Tomcat.
The container keeps them in memory! The container may persist them for
failover/restart but usually the HttpSessions are kept in memory.

1) if there is just one web container node then Wicket asks the container
for the javax.servler.HttpSession: httpServletRequest.getSession()
Then Wicket extracts the org.apache.wicket.Session from the HttpSession
attributes.
The Wicket Session class has a member instance of PageAccessSynchronizer
that is specific for the current instance of the Session.
PageAccessSynchronizer has a Map, i.e. pageId ->
PageLock.
Whenever a request needs to use a specific Wicket Page then the current
session's PageAccessSynchronizer is used to acquire a lock on it. This
makes sure that a specific page instance is used by at most one request in
the *current* server node.

2) if there is session replication in place, i.e. more than one server
nodes, then:
The web container fetches the HttpSessions for its backend. In case of
Tomcat - it keeps the HttpSessions in memory but there is a ClusterManager
that replicates the HttpSessions' which have been modified, i.e. their
#setAttibute() has been called.
In cluster mode there will be one HttpSession per server node, respectfully
one Wicket Session instance per node, and one PageAccessSynchronizer per
session per node.
Using Wicket or not if you don't use sticky sessions you may face timing
related issues in such environment. Every distributed software has this
problem.


On Wed, Feb 26, 2020 at 9:17 PM Thomas Heigl  wrote:

> Hi Sven,
>
> Thanks for the link but I'm not using asynchronous serialization.
>
> I thought some more about the issue and I think I figured it out. My setup
> looks like this:
>
> 1. Spring Session Redis
> 2. [Session Cache] (Not used because it is transient and stored with
> writeObject/readObject and does not get serialized into Redis as we do not
> use Java serialization)
> 3. PerSessionPageStore (Application-level cache held in memory)
> 4. RedisDataStore (Synchronous)
>
> Observations:
>
> 1. If i disable second-level cache or use the serializing second-level
> cache provided by the DefaultPageManager, there are no issues
> 2. As soon as I enable the PerSessionPageStore we run into concurrency
> issues
>
> Analysis:
>
> I first thought that there were some thread-safety issues
> with PerSessionPageStore but that is not the case because even a fully
> synchronized version shows these problems.
>
> The reason why disabling the 2nd-level cache or using a serializing variant
> works, is because they do not operate on the same *instance* of the page.
> Each thread gets their
> own instance because the page is deserialized before being accessed.
>
> PerSessionPageStore stores the page in memory without serializing it, thus
> all threads share the same instance. This is also the case when using the
> session cache or
> the session-based stores, but the PageAccessSynchronizer bound to the
> session takes care of ensuring that only single request can manipulate the
> page at any given time.
>
> So how does the synchronizer work? It keeps a Map in the
> session and checks whether the page is locked on every request. In a
> non-replicated
> environment this works as expected as the session object lives in the
> servlet container and is the same for each concurrent request. In my case,
> the session
> is not provided by the servlet container, but fetched from Redis by Spring
> Session on every request. So each concurrent thread has *their own version*
> of the session and
> the locks are *not shared between threads* because the session is only
> saved back to Redis after the request has finished.
>
> So the problematic flow looks like this
>
> 1. A request comes in, we fetch the session from Redis, the request
> acquires the session-scoped lock and starts processing
> 2. Before the request is finished, another request comes in, fetches the
> session from Redis, the lock map is empty because the state of request #1
> has not been persisted to Redis
> 3. Now both requests can modify the page and we run into concurrency issues
>
> Summary:
>
> PageAccessSynchronizer does not work with Spring Session or other solutions
> that replace the servlet container session.
>
> Possible solutions:
>
> 1. We could ensure that session locks are updated in Redis immediately but
> that still leaves a couple of milliseconds for race conditions and 

Re: PerSessionPageStore thread-safety

2020-02-27 Thread Thomas Heigl
I created https://issues.apache.org/jira/projects/WICKET/issues/WICKET-6751

Feedback would be very welcome.

Best regards,

Thomas

On Thu, Feb 27, 2020 at 10:08 AM Thomas Heigl  wrote:

> Hi all,
>
> I can confirm that my analysis was correct. I implemented a quick version
> of ApplicationScopedPageAccessSynchronizer and everything works as expected.
>
> Providing a custom PageAccessSynchronizer is very cumbersome at the
> moment. I had to override all concrete methods in the base class and
> completely
> duplicate the PageLock class because waitForRelease and markReleased have
> package visibility.
>
> I would suggest to make these two methods in PageLock public so the class
> can be reused and extract all the locking logic into an ILockManager with 3
> methods: lockPage, unlockPage, and unlockAllPages. The default lock
> manager could hold the session-scoped locks collection and custom
> implementations
> could provide their own locking mechanism.
>
> I will create a JIRA issue for this.
>
> Best regards,
>
> Thomas
>
> On Wed, Feb 26, 2020 at 8:17 PM Thomas Heigl  wrote:
>
>> Hi Sven,
>>
>> Thanks for the link but I'm not using asynchronous serialization.
>>
>> I thought some more about the issue and I think I figured it out. My
>> setup looks like this:
>>
>> 1. Spring Session Redis
>> 2. [Session Cache] (Not used because it is transient and stored with
>> writeObject/readObject and does not get serialized into Redis as we do not
>> use Java serialization)
>> 3. PerSessionPageStore (Application-level cache held in memory)
>> 4. RedisDataStore (Synchronous)
>>
>> Observations:
>>
>> 1. If i disable second-level cache or use the serializing second-level
>> cache provided by the DefaultPageManager, there are no issues
>> 2. As soon as I enable the PerSessionPageStore we run into concurrency
>> issues
>>
>> Analysis:
>>
>> I first thought that there were some thread-safety issues
>> with PerSessionPageStore but that is not the case because even a fully
>> synchronized version shows these problems.
>>
>> The reason why disabling the 2nd-level cache or using a serializing
>> variant works, is because they do not operate on the same *instance* of the
>> page. Each thread gets their
>> own instance because the page is deserialized before being accessed.
>>
>> PerSessionPageStore stores the page in memory without serializing it,
>> thus all threads share the same instance. This is also the case when using
>> the session cache or
>> the session-based stores, but the PageAccessSynchronizer bound to the
>> session takes care of ensuring that only single request can manipulate the
>> page at any given time.
>>
>> So how does the synchronizer work? It keeps a Map in
>> the session and checks whether the page is locked on every request. In a
>> non-replicated
>> environment this works as expected as the session object lives in the
>> servlet container and is the same for each concurrent request. In my case,
>> the session
>> is not provided by the servlet container, but fetched from Redis by
>> Spring Session on every request. So each concurrent thread has *their own
>> version* of the session and
>> the locks are *not shared between threads* because the session is only
>> saved back to Redis after the request has finished.
>>
>> So the problematic flow looks like this
>>
>> 1. A request comes in, we fetch the session from Redis, the request
>> acquires the session-scoped lock and starts processing
>> 2. Before the request is finished, another request comes in, fetches the
>> session from Redis, the lock map is empty because the state of request #1
>> has not been persisted to Redis
>> 3. Now both requests can modify the page and we run into concurrency
>> issues
>>
>> Summary:
>>
>> PageAccessSynchronizer does not work with Spring Session or other
>> solutions that replace the servlet container session.
>>
>> Possible solutions:
>>
>> 1. We could ensure that session locks are updated in Redis immediately
>> but that still leaves a couple of milliseconds for race conditions and adds
>> a lot of overhead
>> 2. We could use an application-scoped PageAccessSynchronizer. This solves
>> the problem as long as sessions are sticky and all concurrent requests are
>> sent to the same server.
>> 3. If we want to use non-sticky session we could use Redis locks for
>> implementing a global PageAccessSynchronizer
>>
>> I would like to go with solution #2 for now. The problem is
>> that PageAccessSynchronizer is not an interface.
>>
>> Would it be possible to extract an interface so I can easily implement
>> access synchronizers with different scopes?
>>
>> Best regards,
>>
>> Thomas
>>
>>
>>
>>
>>
>> On Wed, Feb 26, 2020 at 12:24 PM Sven Meier  wrote:
>>
>>> Hi Thomas,
>>>
>>> Im wondering whether you're running into
>>> https://issues.apache.org/jira/browse/WICKET-6702
>>>
>>> I've been working on a solution to that problem, which is caused by
>>> pages being asynchronously serialized while another request is already
>>> 

Re: PerSessionPageStore thread-safety

2020-02-27 Thread Thomas Heigl
Hi all,

I can confirm that my analysis was correct. I implemented a quick version
of ApplicationScopedPageAccessSynchronizer and everything works as expected.

Providing a custom PageAccessSynchronizer is very cumbersome at the moment.
I had to override all concrete methods in the base class and completely
duplicate the PageLock class because waitForRelease and markReleased have
package visibility.

I would suggest to make these two methods in PageLock public so the class
can be reused and extract all the locking logic into an ILockManager with 3
methods: lockPage, unlockPage, and unlockAllPages. The default lock manager
could hold the session-scoped locks collection and custom implementations
could provide their own locking mechanism.

I will create a JIRA issue for this.

Best regards,

Thomas

On Wed, Feb 26, 2020 at 8:17 PM Thomas Heigl  wrote:

> Hi Sven,
>
> Thanks for the link but I'm not using asynchronous serialization.
>
> I thought some more about the issue and I think I figured it out. My setup
> looks like this:
>
> 1. Spring Session Redis
> 2. [Session Cache] (Not used because it is transient and stored with
> writeObject/readObject and does not get serialized into Redis as we do not
> use Java serialization)
> 3. PerSessionPageStore (Application-level cache held in memory)
> 4. RedisDataStore (Synchronous)
>
> Observations:
>
> 1. If i disable second-level cache or use the serializing second-level
> cache provided by the DefaultPageManager, there are no issues
> 2. As soon as I enable the PerSessionPageStore we run into concurrency
> issues
>
> Analysis:
>
> I first thought that there were some thread-safety issues
> with PerSessionPageStore but that is not the case because even a fully
> synchronized version shows these problems.
>
> The reason why disabling the 2nd-level cache or using a serializing
> variant works, is because they do not operate on the same *instance* of the
> page. Each thread gets their
> own instance because the page is deserialized before being accessed.
>
> PerSessionPageStore stores the page in memory without serializing it, thus
> all threads share the same instance. This is also the case when using the
> session cache or
> the session-based stores, but the PageAccessSynchronizer bound to the
> session takes care of ensuring that only single request can manipulate the
> page at any given time.
>
> So how does the synchronizer work? It keeps a Map in
> the session and checks whether the page is locked on every request. In a
> non-replicated
> environment this works as expected as the session object lives in the
> servlet container and is the same for each concurrent request. In my case,
> the session
> is not provided by the servlet container, but fetched from Redis by Spring
> Session on every request. So each concurrent thread has *their own version*
> of the session and
> the locks are *not shared between threads* because the session is only
> saved back to Redis after the request has finished.
>
> So the problematic flow looks like this
>
> 1. A request comes in, we fetch the session from Redis, the request
> acquires the session-scoped lock and starts processing
> 2. Before the request is finished, another request comes in, fetches the
> session from Redis, the lock map is empty because the state of request #1
> has not been persisted to Redis
> 3. Now both requests can modify the page and we run into concurrency issues
>
> Summary:
>
> PageAccessSynchronizer does not work with Spring Session or other
> solutions that replace the servlet container session.
>
> Possible solutions:
>
> 1. We could ensure that session locks are updated in Redis immediately but
> that still leaves a couple of milliseconds for race conditions and adds a
> lot of overhead
> 2. We could use an application-scoped PageAccessSynchronizer. This solves
> the problem as long as sessions are sticky and all concurrent requests are
> sent to the same server.
> 3. If we want to use non-sticky session we could use Redis locks for
> implementing a global PageAccessSynchronizer
>
> I would like to go with solution #2 for now. The problem is
> that PageAccessSynchronizer is not an interface.
>
> Would it be possible to extract an interface so I can easily implement
> access synchronizers with different scopes?
>
> Best regards,
>
> Thomas
>
>
>
>
>
> On Wed, Feb 26, 2020 at 12:24 PM Sven Meier  wrote:
>
>> Hi Thomas,
>>
>> Im wondering whether you're running into
>> https://issues.apache.org/jira/browse/WICKET-6702
>>
>> I've been working on a solution to that problem, which is caused by pages
>> being asynchronously serialized while another request is already coming in.
>>
>> Or maybe it is something different.
>> Could you create a quickstart?
>>
>> Sven
>>
>> Am 25. Februar 2020 22:12:46 MEZ schrieb Thomas Heigl <
>> tho...@umschalt.com>:
>> >Hi again,
>> >
>> >I investigated a bit and it does not seem to have anything to do with
>> >the
>> >PerSessionPageStore. I implemented a completely