TomcatCluster data replication

2011-04-06 Thread Jürgen Jakobitsch
hi,

i'm in need of data replication in a tomcat cluster.
i set up a cluster of three tomcats on a single machine with an apache 
(mod_jk) front end that does the load balancing.
everything works absolutely charmingly for read requests, my trouble starts 
with data input.

what i'm trying to achieve is that when i submit data with an html form, the 
storage on all cluster members gets updated.
i'm using openrdf's sesame triple store, which locks its data directory, so i 
can't simply use a single shared directory
in my application.

what i have in mind, after some first reading, is some sort of cluster valve that 
checks whether a request is a POST request and, if
so, sends this request (which updates the repository in the back) to all 
members of the cluster.
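
A minimal sketch of that decision logic, in plain Java and deliberately without the Tomcat Valve API. The class name, the method name, the member addresses and the "already forwarded" flag are all assumptions made for illustration. In a real setup the flag would have to travel with the forwarded request (for example as a custom header), otherwise the members would forward each other's copies in a loop.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: decides which peers should get a copy of a request. */
public class PostFanOut {

    /**
     * Returns the peers that should receive a copy of the request.
     * Only POST requests that came straight from a client are forwarded;
     * reads and copies received from a peer are handled locally only.
     */
    public static List<String> forwardTargets(String method, boolean alreadyForwarded,
                                              String self, List<String> members) {
        List<String> targets = new ArrayList<>();
        if (!"POST".equalsIgnoreCase(method) || alreadyForwarded) {
            return targets; // nothing to forward
        }
        for (String member : members) {
            if (!member.equals(self)) {
                targets.add(member); // every member except the one handling the request
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Assumed member addresses, for illustration only.
        List<String> members = List.of("tomcat1:8080", "tomcat2:8080", "tomcat3:8080");
        System.out.println(forwardTargets("POST", false, "tomcat1:8080", members));
        System.out.println(forwardTargets("GET", false, "tomcat1:8080", members));
        System.out.println(forwardTargets("POST", true, "tomcat2:8080", members));
    }
}
```

This only covers who gets a copy. It says nothing about the order in which concurrent writes arrive at each member.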

so here would be my questions :

1. is there a standard way of doing something like this (with a non-clusterable 
data backend)?
2. is the cluster valve idea in fact the correct starting point?

any help or pointer in the right direction is greatly appreciated

wkr turnguard.com/turnguard

-- 
punkt. netServices
__
Jürgen Jakobitsch
Codeography

Lerchenfelder Gürtel 43 Top 5/2
A - 1160 Wien
Tel.: 01 / 897 41 22 - 29
Fax: 01 / 897 41 22 - 22

netServices http://www.punkt.at


-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: TomcatCluster data replication

2011-04-06 Thread Filip Hanik - Dev Lists

On 4/6/2011 1:22 PM, Jürgen Jakobitsch wrote:

> what i'm trying to achieve is that when i submit data with an html form, the
> storage on all cluster members gets updated.
> i'm using openrdf's sesame triple store, which locks its data directory, so i
> can't simply use a single shared directory in my application.


Sounds like a limitation of Sesame. Use some other NoSQL data store and you 
won't have this issue.

best
Filip





Re: TomcatCluster data replication

2011-04-06 Thread André Warnier

Jürgen Jakobitsch wrote:

> so here would be my questions:
>
> 1. is there a standard way of doing something like this (with a non-clusterable
> data backend)?

No.

> 2. is the cluster valve idea in fact the correct starting point?

Probably not.

> any help or pointer in the right direction is greatly appreciated

I'm not saying that it would not be possible to do this, and I have no idea what an 
openrdf Sesame triple store is.
But what you describe sounds more like something that should be handled at the level of 
the application which processes the POST.  It is the application which should arrange to 
update the N back-end data stores at the same time.  Of course, that introduces some 
interesting locking and synchronisation issues if two near-simultaneous requests 
handled by two separate Tomcats try to update the same piece of data in each of the 
data stores.


Now, just out of curiosity, what is the real-world point of this setup, considering that 
your 3 Tomcats are running on the same host?

Why not have a single Tomcat with 3 times the resources to handle all the 
requests?




Re: TomcatCluster data replication

2011-04-06 Thread Jürgen Jakobitsch
hi, thanks for your input..

1. switching the backend is apparently not an option, otherwise i wouldn't have asked 
about a non-clusterable data backend.
2. it wouldn't be that two requests update one piece of data. rather, the first 
cluster member that receives 
   a POST request also posts that request to the other members, which then simply 
handle it. since every 
   application has its own data directory, every member would write into its 
own data directory. that's why the requests
   need to be forwarded to all members of the cluster.
3. these three tomcats on one machine are for testing purposes only. in the real 
world they would run on different physical machines.

imagine you have a simple text file in the WEB-INF directory of a webapp named 
ClusterApp. this ClusterApp is deployed 
on three tomcats in a cluster. now a POST request comes in that updates the text 
file (adds one line to it).
now of course i need to synchronize the text file on all tomcats in the cluster.

in my opinion there are only a few options to achieve this:
1. rsync the file, which is kind of hard, since i have a load balancer and 
don't know exactly which member answers the request. there are 
   too many uncertainties.
2. check all incoming requests for HTTP POST. if the request is a POST, then 
simply send it to all members of the cluster.


honestly, i can hardly imagine that i'm the first to come across this use case...


any help really appreciated..
wkr turnguard.com/turnguard





Re: TomcatCluster data replication

2011-04-06 Thread André Warnier

Jürgen Jakobitsch wrote:
...


> imagine you have a simple text file in the WEB-INF directory of a webapp named
> ClusterApp. this ClusterApp is deployed on three tomcats in a cluster. now a POST
> request comes in that updates the text file (adds one line to it).
>
> now of course i need to synchronize the text file on all tomcats in the cluster.


Ok, let's imagine there are initially 3 identical simple text files, one on each of 
the 3 Tomcats.
And there are 2 clients accessing the load balancer.
In order to determine if they need to update the text file, the clients first request the 
text file to examine it.  Their requests go to 2 different tomcats via the load-balancer.
But it does not matter, since they both get the same response text file, since it is 
identical.

Now client A decides to update the file by adding a line XXX to it.
And client B decides to update the file by adding a line YYY to it.
They both POST their request at about the same time to the front-end, and the front-end 
(or whatever replication mechanism) sends each request to all 3 back-end tomcats.

When the 2 POST requests have been processed, what is the state of the 3 text 
files ?
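
One possible outcome can be sketched as follows. This is a toy simulation in plain Java, not Tomcat code. The class name and the particular arrival orders are assumptions chosen for illustration, since without coordination nothing forces every member to see the two POSTs in the same order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical simulation of two appends reaching three replicas in different orders. */
public class ReplicaRace {

    /** Applies the appended lines to a copy of the file, in the order they arrived. */
    public static List<String> applyInOrder(List<String> initial, List<String> arrivals) {
        List<String> file = new ArrayList<>(initial);
        file.addAll(arrivals);
        return file;
    }

    public static void main(String[] args) {
        List<String> initial = List.of("line 1");
        // Assumed interleaving: tomcat2 happens to see client B's POST first.
        List<String> tomcat1 = applyInOrder(initial, List.of("XXX", "YYY"));
        List<String> tomcat2 = applyInOrder(initial, List.of("YYY", "XXX"));
        List<String> tomcat3 = applyInOrder(initial, List.of("XXX", "YYY"));
        System.out.println("tomcat1: " + tomcat1);
        System.out.println("tomcat2: " + tomcat2);
        System.out.println("tomcat3: " + tomcat3);
        System.out.println("identical: " + (tomcat1.equals(tomcat2) && tomcat2.equals(tomcat3)));
    }
}
```

With this interleaving all three files contain both lines, but tomcat2 has them in a different order, so the replicas are no longer identical.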







Re: TomcatCluster data replication

2011-04-06 Thread Thomas Strauß

On 06.04.2011 at 22:35, André Warnier wrote:

> When the 2 POST requests have been processed, what is the state of the 3 text
> files ?
 

I would say this is a classic case for either a centralized data store or a 
distributed transaction manager.

To solve the issue with the existing setup, I would probably serialize the 
write requests into a message queue that has one subscriber per cluster member. 
Only the subscriber thread is allowed to write to the file.
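
A rough in-process sketch of that idea, assuming a single publisher that hands every write to all members in one order. A real deployment would use a message broker between separate JVMs; here a `LinkedBlockingQueue` per member stands in for it, and all class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: one queue and one writer thread per cluster member. */
public class WriteQueueSketch {

    // Sentinel compared by identity, so it can never clash with real data.
    private static final String STOP = new String("STOP");

    /** One cluster member: a private inbox plus the only thread allowed to write. */
    static class Member {
        final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        final List<String> file = new ArrayList<>(); // stands in for the member's data store
        final Thread writer = new Thread(() -> {
            try {
                for (String line = inbox.take(); line != STOP; line = inbox.take()) {
                    file.add(line); // touched by this thread only
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Publishes the writes to every member in one order and returns each member's file. */
    public static List<List<String>> replicate(int memberCount, List<String> writes) {
        List<Member> members = new ArrayList<>();
        for (int i = 0; i < memberCount; i++) {
            Member m = new Member();
            m.writer.start();
            members.add(m);
        }
        // Single publisher: every inbox receives the writes in the same sequence.
        for (String w : writes) {
            for (Member m : members) {
                m.inbox.add(w);
            }
        }
        List<List<String>> result = new ArrayList<>();
        try {
            for (Member m : members) {
                m.inbox.add(STOP);
                m.writer.join(); // after join, the writer's updates are visible here
                result.add(m.file);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(replicate(3, List.of("XXX", "YYY")));
    }
}
```

Because each member drains its own queue in FIFO order and the publisher feeds all queues in the same sequence, every member ends up with the same file. With several publishers, the ordering would have to come from the broker instead.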

With a little tuning of the queue setup you get a fail-safe system in which a 
crashed cluster member recovers and continues with its copies of the 
write requests. 

Overall, the files should be reasonably equal. If you need real-time updates of 
all cluster members, the data store you have chosen IMHO sucks  :-) I suppose 
you would then need to add a kind of distributed transaction. This could be 
achieved by sending a commit message to all cluster members once you have 
successfully written your data. For each dataset, you wait for a commit message 
from all the others before you serve it to clients again... sounds a little bit 
like reinventing the wheel.
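
The bookkeeping for such a commit scheme could look roughly like the sketch below. It is a simplification under stated assumptions: the class and method names are hypothetical, commit messages are plain method calls rather than network messages, a dataset is held back until every member (including the local one) has committed, and failures or timeouts are not handled at all.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: a dataset is served only after all members confirmed the write. */
public class CommitTracker {

    private final Set<String> members;
    // dataset id -> members that have confirmed the pending write
    private final Map<String, Set<String>> acks = new HashMap<>();

    public CommitTracker(Set<String> members) {
        this.members = Set.copyOf(members);
    }

    /** Marks a dataset as being written; it is held back until every member commits. */
    public synchronized void beginWrite(String datasetId) {
        acks.put(datasetId, new HashSet<>());
    }

    /** Records a commit message from one member; unknown members are ignored. */
    public synchronized void commit(String datasetId, String member) {
        Set<String> seen = acks.get(datasetId);
        if (seen != null && members.contains(member)) {
            seen.add(member);
        }
    }

    /** True if no write is pending for the dataset or all members have committed it. */
    public synchronized boolean servable(String datasetId) {
        Set<String> seen = acks.get(datasetId);
        return seen == null || seen.containsAll(members);
    }

    public static void main(String[] args) {
        CommitTracker tracker = new CommitTracker(Set.of("tomcat1", "tomcat2", "tomcat3"));
        tracker.beginWrite("dataset-1");
        tracker.commit("dataset-1", "tomcat1");
        tracker.commit("dataset-1", "tomcat2");
        System.out.println(tracker.servable("dataset-1")); // one commit still missing
        tracker.commit("dataset-1", "tomcat3");
        System.out.println(tracker.servable("dataset-1")); // all members have committed
    }
}
```

A member that crashes before committing would block the dataset forever in this sketch, which is one of the reasons a proper transaction manager exists.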


Regards,
 Thomas

 
 
 