Re: mod_jk: plus-character causes %-encoding problems

2010-01-15 Thread Tero Karttunen
 The '+' char has no special meaning in HTTP/1.1 (RFC 2616) [1], so in
 the path part of the URL it just means itself, the plus sign.

 Any bug in either mod_alias or mod_jk could be proven with regard to the
 above statement by simply changing the URL from:

 http://localhost/sites/one%2Bone%3Cthree
 to:
 http://localhost/sites?one%2Bone%3Cthree

Whaddya know - the test results show that Konstantin is right! The
question mark makes all the difference, and incidentally, '+' seems to
be the only character that is being treated differently by the decoder.

When presented with the request
http://localhost/sites/one%2Bone%3Cthree?one%2Bone%3Cthree
mod_alias responds with a redirect to
http://localhost/contextroot/subcontext/sites/one+one%3cthree?one%2Bone%3Cthree
leaving the query part intact.

mod_jk also behaves in a similar fashion with +ForwardURIProxy. A request to
http://localhost/contextroot/subcontext/sites/one%2Bone%3Cthree?one%2Bone%3Cthree
shows up as
http://localhost/contextroot/subcontext/sites/one+one%3Cthree?one%2Bone%3Cthree
in Tomcat's logs.

Furthermore, the org.apache.tomcat.util.buf.UDecoder and
org.apache.catalina.util.RequestUtil classes seem to have special
handling logic for query parts, which enables checks of this kind:

    if (b == '+' && isQuery) {
        b = (byte)' ';
    } else if (b == '%') {
        b = (byte) ((convertHexDigit(bytes[ix++]) << 4)
                + convertHexDigit(bytes[ix++]));
    }

I have not yet actually tested this in practice, but it looks like the
facts speak for themselves. I still cannot find this functionality in
the specs, but it looks like I don't have to. :-) Good job, Konstantin!
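
To make the path/query distinction concrete, here is a minimal
standalone sketch of a percent-decoder with an isQuery switch. It is
not Tomcat's code: the class name PercentDecoder and its error
handling are assumptions made for illustration, and the input is
assumed to be plain ASCII, as a raw URL should be.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class PercentDecoder {

    /** Decodes %XX escapes; '+' becomes a space only when isQuery is true. */
    static String decode(String s, boolean isQuery) {
        byte[] in = s.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        for (int i = 0; i < in.length; i++) {
            byte b = in[i];
            if (b == '+' && isQuery) {
                out.write(' ');
            } else if (b == '%') {
                if (i + 2 >= in.length) {
                    throw new IllegalArgumentException("Truncated escape in: " + s);
                }
                int hi = Character.digit(in[i + 1], 16);
                int lo = Character.digit(in[i + 2], 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid escape in: " + s);
                }
                out.write((hi << 4) + lo);
                i += 2; // skip the two hex digits just consumed
            } else {
                out.write(b);
            }
        }
        // Decoded octets are interpreted as UTF-8 (an assumption of this sketch).
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("one%2Bone%3Cthree", false)); // one+one<three
        System.out.println(decode("one+one%3Cthree", false));   // one+one<three
        System.out.println(decode("one+one%3Cthree", true));    // one one<three
    }
}
```

Note that a literal '+' produced by decoding %2B is never turned into
a space, even in query mode, because the escape is handled in its own
branch.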

In the end, the only real bug here seems to reside in my application's
use of Apache Commons URLCodec. I will have to replace that with some
other solution that respects the difference between path and query
parts. I should be able to use +ForwardURIProxy as well. It may be a
different solution than I expected, and it does not distinguish
between %2B and '+' characters, but it should work and be spec- (or at
least reality-) compliant. :-)

I will let you know whether it solves the problem.

Tero Karttunen

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: mod_jk: plus-character causes %-encoding problems

2010-01-14 Thread Tero Karttunen
 Why is '+' decoded to ' ' in the path part of the URL?
 That is, I think, wrong.

This is an interesting theory. If true, it could provide an
explanation for the observed behavior, but I cannot completely follow
it.

 The '+' char has no special meaning in HTTP/1.1 (RFC 2616) [1], so in
 the path part of the URL it just means itself, the plus sign.

On the other hand, the same RFC provides a counter-example. Look at
section 3.2.3, URI comparison. It says that characters other than
those in the reserved and unsafe sets are equivalent to their
%-encoded counterparts. The reserved set as defined in RFC 2396 (and
in the later RFC 3986 that obsoletes it) includes the '+' character.

I believe section 3.2.3 means that the characters in the reserved
set are not equivalent to their %-encoded counterparts, and in this
way, /contextroot/subcontext/sites/one+one%3cfive IS NOT equivalent to
/contextroot/subcontext/sites/one%2Bone%3cfive when doing URI
comparison.
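
A rough sketch of such a comparison for the path part only: escapes
of unreserved characters are decoded, every other escape is kept (with
its hex digits uppercased so that %3c and %3C compare equal), and the
results are compared as strings. The class and method names are
hypothetical, and the sketch assumes the narrower unreserved set of
RFC 3986 (letters, digits, '-', '.', '_', '~') rather than the exact
sets of RFC 2396. Scheme, host and port rules are left out.

```java
final class UriPathCompare {

    // Unreserved characters per RFC 3986 (an assumption of this sketch).
    private static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    /** Decodes only escapes of unreserved characters; uppercases the rest. */
    static String canonicalize(String path) {
        StringBuilder sb = new StringBuilder(path.length());
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '%' && i + 2 < path.length()) {
                int hi = Character.digit(path.charAt(i + 1), 16);
                int lo = Character.digit(path.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    int value = (hi << 4) + lo;
                    if (isUnreserved(value)) {
                        sb.append((char) value);
                    } else {
                        sb.append('%')
                          .append(Character.toUpperCase(Character.forDigit(hi, 16)))
                          .append(Character.toUpperCase(Character.forDigit(lo, 16)));
                    }
                    i += 2; // skip the two hex digits
                    continue;
                }
            }
            sb.append(c); // literal character, including a literal '+'
        }
        return sb.toString();
    }

    static boolean samePath(String a, String b) {
        return canonicalize(a).equals(canonicalize(b));
    }

    public static void main(String[] args) {
        // '+' is reserved, so the literal and the escape stay distinct.
        System.out.println(samePath("/sites/one+one%3cfive",
                                    "/sites/one%2Bone%3cfive")); // false
        // Only the case of the hex digits differs.
        System.out.println(samePath("/sites/one%2Bone%3cfive",
                                    "/sites/one%2bone%3Cfive")); // true
    }
}
```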

 It is the HTML Forms spec [2] that makes it special, defining
 urlencoding used when submitting web forms through HTTP. It has
 special meaning only in the query part of the URL and only because of
 that part of HTML spec.

HTML Forms spec does define www-form-urlencoding, but I can't tell
from the spec whether it is limited to just the query part.

 What my application actually sees after decoding: sites/one one<five

 What is your application code here? Where and how do you obtain the
 decoded value?

I am using Apache Commons URLCodec to decode the URL. This widely-used
utility class does not make the distinction between path and query
parts...
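
The effect can be reproduced with the JDK alone. In the sketch below,
java.net.URLDecoder stands in for URLCodec (both apply form decoding,
so '+' becomes a space), while java.net.URI.getPath() decodes a path
and leaves '+' alone. The class and method names are made up for the
example, and URLDecoder.decode(String, Charset) assumes Java 10 or
later.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class PathVersusFormDecoding {

    /** Form decoding: %XX escapes are decoded and '+' becomes ' '. */
    static String formDecode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    /** Path decoding: %XX escapes are decoded, '+' stays '+'. */
    static String pathDecode(String rawPath) {
        // URI.create throws IllegalArgumentException for malformed input.
        return URI.create(rawPath).getPath();
    }

    public static void main(String[] args) {
        String sent = "/sites/one%2Bone%3Cthree";    // what the client sent
        String forwarded = "/sites/one+one%3Cthree"; // after '+' was decoded once

        System.out.println(formDecode(sent));      // /sites/one+one<three
        System.out.println(formDecode(forwarded)); // /sites/one one<three
        System.out.println(pathDecode(sent));      // /sites/one+one<three
        System.out.println(pathDecode(forwarded)); // /sites/one+one<three
    }
}
```

With path decoding, both forms of the request yield the same string,
so a partially decoded URI no longer changes the result.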

Let me explain my application before I provide the code example. As
you could guess from its name, TeamCenterEmulator, my application
emulates a set of former URLs, continuing to serve the pre-existing
links while the legacy application is retired.

My application is configured with a CSV file containing a mapping
between a URL and the resource it is supposed to serve (in a dynamic
fashion, it is not a simple file). Say, the application could contain
the following mapping:

former url                  response
/sites/foo                  file1
/sites/bar                  file2
/sites/one%2Bone%3cthree    file3
/sites/one%2Bone%3cfour     file4
/sites/one%2Bone%3cfive     file5
...

Once the application initializes, it reads the mapping into memory,
and if the request matches the former url EXACTLY, the matching
response is returned. This is the application spec. Note that under
RFC 2616-compliant URI comparison, my application must regard the
request /sites/one+one%3cfive as a non-match!

Here is doGet from my servlet. Note that I am trimming the URL to
start from the sites part for obvious reasons...

protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    super.doGet(request, response);
    // Lazily load the URL mapping named by the context init parameter.
    if (config == null) {
        config = new ConfigurationFactory().createConfiguration(
                getServletContext().getInitParameter("teamCenterURLMapping"));
    }
    // Prefix to strip from the request URI: "<context path>/<servlet name>/".
    String urlSnippet = (getServletContext().getContextPath() +
            "/" + getServletConfig().getServletName() + "/");
    String url = "";
    if (request.getRequestURI().length() > urlSnippet.length()) {
        url = request.getRequestURI().substring(urlSnippet.length());
    }
    try {
        TeamCenterConfigurationItem item = config.findByURL(url);
        [...]
    } catch (UnknownUrlException e) {
        ...
    }
}

I am not going to post ConfigurationFactory, because it is not
interesting. It basically builds a HashMap based on the CSV file that
has URLCodec.decode()'d former urls as its keys, with the idea that if
we URL-decode the incoming request, we can search the HashMap for
matches.

Here is how the abovementioned findByURL method does just that:

public TeamCenterConfigurationItem findByURL(String url)
        throws UnknownUrlException {
    // URLCodec applies form decoding, so '+' is turned into ' ' as well.
    URLCodec codec = new URLCodec("UTF8");
    try {
        url = codec.decode(url);
        logger.info(url);
    } catch (DecoderException e) {
        logger.error(e);
        throw new UnknownUrlException(url);
    }
    if (config.containsKey(url)) {
        return config.get(url);
    }
    throw new UnknownUrlException(url);
}
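
For comparison, here is a small sketch of the same lookup scheme with
the decoder made pluggable, so that the CSV keys and the incoming
request are guaranteed to pass through one and the same function. It
is not the real ConfigurationFactory: LegacyUrlMap and its methods are
hypothetical names, and the path-aware decoder based on
java.net.URI.getPath() is only one possible choice.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.UnaryOperator;

final class LegacyUrlMap {

    private final Map<String, String> byDecodedUrl = new HashMap<>();
    private final UnaryOperator<String> decoder;

    LegacyUrlMap(UnaryOperator<String> decoder) {
        this.decoder = decoder;
    }

    /** Registers a former url; the key is stored in decoded form. */
    void add(String formerUrl, String resource) {
        byDecodedUrl.put(decoder.apply(formerUrl), resource);
    }

    /** Decodes the request path with the same decoder and looks it up. */
    Optional<String> find(String requestPath) {
        return Optional.ofNullable(byDecodedUrl.get(decoder.apply(requestPath)));
    }

    public static void main(String[] args) {
        // Path-aware decoding: %XX escapes are decoded, '+' is left alone.
        LegacyUrlMap map = new LegacyUrlMap(p -> URI.create(p).getPath());
        map.add("/sites/foo", "file1");
        map.add("/sites/one%2Bone%3cfive", "file5");

        System.out.println(map.find("/sites/one%2Bone%3Cfive")); // Optional[file5]
        System.out.println(map.find("/sites/one+one%3cfive"));   // Optional[file5]
        System.out.println(map.find("/sites/one%20one%3cfive")); // Optional.empty
    }
}
```

With this decoder, %2B and a literal '+' in the request both match the
same entry, which is looser than a strict RFC 2616 comparison but
tolerates a front end that has already decoded the '+'.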

What do you think? Is my approach valid? Am I somehow abusing
URLCodec? Should the request be (partially) decoded in some other way?

Best Regards,
Tero Karttunen




Re: mod_jk: plus-character causes %-encoding problems

2010-01-14 Thread Tero Karttunen
 Is UTF-8 the reason why you are using your custom decoding?
 [...]
 You should be able to use HttpServletRequest.getPathInfo() to get the
 decoded value.

Not really. I could probably use getPathInfo() to get the decoded
request. But note that I am also decoding the former urls from the CSV
input file and comparing the result with the decoded request, so in
any case I have to use some custom decoder in my code. In these kinds
of situations, I reflexively do the decoding with the same decoder to
make absolutely sure that the decoded strings match for identical
inputs. Otherwise, if I let the servlet container use its decoder for
requests and use my own custom decoder for the mapping table, I may
run into trouble unnecessarily and - in the worst case - make my
application application-server-specific. Of course, such a thing could
never happen with Tomcat! :-)

Thanks for pointing this out!

Tero Karttunen




Re: mod_jk: plus-character causes %-encoding problems

2010-01-13 Thread Tero Karttunen
 in trying to find out what is going on, as is RFC 3986.

All this is making my head hurt, but what I guess is going on is that
the original URL (still available as r->unparsed_uri) is being decoded
in Apache HTTPD at a very early stage, and once mod_jk or other
dispatchers activate, the r->uri they handle can already be a result
of multiple URI manipulations by mod_rewrite and other modules, and
for that reason it can be considered unsafe to mindlessly re-encode
some of its reserved characters. But this is only my first guess.

+ForwardURICompatUnparsed solves the mod_jk part of the problem _for
me_, but while the HTTPD people are working on bug 32328 (since 2007),
could it be beneficial for mod_jk to maybe offer a fifth forwarding
mode as a workaround for mod_jk users? Maybe taking a list of
characters to be encoded as an extra argument?

Unfortunately, I still have no ideas on how to configure the URL
redirection for Apache HTTPD so that the plus-characters are preserved
in encoded format. Does anyone have any ideas or hints?

Thanks for help!

Tero Karttunen




mod_jk: plus-character causes %-encoding problems

2010-01-11 Thread Tero Karttunen
 redirects). I have not found a similar
workaround for mod_alias yet.

(mod_rewrite does have the [B] option, encode backreferences, but my
brief experiments with [B,R] failed miserably with the results being
one%252bone%253csix and similar double-encoded garbage).

How should I change my configuration so that
http://localhost/sites/one%2Bone%3Cthree gives the same results
as http://localhost:8082/contextroot/subcontext/sites/one%2Bone%3Cthree?

Best Regards,
Tero Karttunen




Feedback on Tomcat Client Deployer

2009-12-08 Thread Tero Karttunen
I have something that must be a pretty standard J2EE application
server environment: I am running (at least) three Tomcat instances
sharing a common Catalina home with mod_jk load balancer in front. I
have not set up clustering, because my GWT-based applications maintain
state information client-side and do not utilize server-side J2EE Web
Application context sessions.

In this configuration I regularly need to deploy or re-deploy new
applications to all the Tomcat instances, and I was surprised to find
out that the support for this out of the box was quite poor. All I
could find was the Tomcat Client Deployer package, and all the
documentation I could find was chapter four in the Tomcat user guide.
I apologize in advance if I have missed something; prove me wrong!

Firstly, the build.xml file does not suffice as documentation for the
ant tasks! I had to RTFS to find out the needed parameters for the
deploy task (localWar and config), and there seems to be an
ant task JkStatusUpdate that is completely undocumented. It seems to
be related to Tomcat 5.5 (according to source code comments), so I
have left it alone for now.

Secondly, the example ant-script is only suitable for deploying to one
Tomcat manager instance at a time, which is a major limitation. I have
to write my own ant script for deployment to multiple manager
instances.
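
As a rough illustration of what such a script has to do, the sketch
below walks through the instances one at a time, optionally taking
each one out of the balancer while it is redeployed. It is written in
Java rather than ant, and everything in it is hypothetical: the
Backend interface merely stands in for whatever actually talks to the
Tomcat manager and the mod_jk status worker.

```java
import java.util.ArrayList;
import java.util.List;

final class RollingRedeploy {

    /** Hypothetical hooks; real implementations would call the manager and jkstatus. */
    interface Backend {
        void disableWorker(String worker) throws Exception;
        void redeploy(String worker, String warPath) throws Exception;
        void enableWorker(String worker) throws Exception;
    }

    /**
     * Redeploys the WAR to each instance in turn and returns the workers
     * that failed. A failed worker is left disabled so that it does not
     * serve a half-deployed application.
     */
    static List<String> run(List<String> workers, String warPath, Backend backend) {
        List<String> failed = new ArrayList<>();
        for (String worker : workers) {
            try {
                backend.disableWorker(worker);
                backend.redeploy(worker, warPath);
                backend.enableWorker(worker); // reached only after a successful redeploy
            } catch (Exception e) {
                failed.add(worker);
            }
        }
        return failed;
    }
}
```

Handling one instance at a time keeps the remaining instances serving
requests throughout; whether a failed instance should stay disabled or
be rolled back is a policy choice that this sketch does not settle.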

But this is not all - I am also going to need to integrate the mod_jk
Status Worker ant tasks into my script, because I need to temporarily
disable workers while I undeploy and re-deploy applications to them.
(Otherwise the users may get unnecessary errors.) No problem here - I
can use the updateworker task, but here is a major issue:

I am unable to find tomcat-jkstatus-ant.jar binaries anywhere!

The mod_jk binaries directory
/dist/tomcat/tomcat-connectors/jk/binaries/win32/jk-1.2.28 only has
the Apache HTTPD module available for downloading. Moreover, the ant
example uses <pathelement location="../dist/tomcat-jkstatus-ant.jar"/>,
which seems incomprehensible to me.

I downloaded the source package in order to compile the jar file
myself, but it was not trivial and would have required some effort. I
resorted to googling for the package in an unofficial location
(http://www.java2s.com/Code/Jar/STUVWXYZ/tomcat-jkstatus-ant.jar.htm).

My humble suggestions are:
1) Please document the TCD ant task parameters so that the user does
not have to resort to source code
2) Please make it easier to locate tomcat-jkstatus-ant.jar

Let me apologize again if I am on the wrong track, because you are
producing excellent software that constantly exceeds my expectations!
It is only because of this very high standard that I am giving this
feedback, because so far the support for application deployment has
not been as extensive as the rest of your software.

Best Regards,
Tero Karttunen




mod_jk: How to configure separate failover for different JkMounts?

2009-11-23 Thread Tero Karttunen
BACKGROUND INFORMATION:
I have used mod_jk to configure Apache to work as a load balancer for
two Tomcat server instances. To these Tomcat instances, I have
deployed two Web Applications, ts_core_virtual_repository and pum.
These Web Applications are actually simple servlets that DO NOT use
J2EE sessions, so even though I want to retain support for sticky
sessions for future purposes, that is not necessary yet.

I have set up failover for my Web Applications by setting the
following in worker.properties for the loadbalancer workers:

worker.template.fail_on_status=500

This effectively means that any ServletException that the Web
Applications throw causes failover to happen: the worker moves to the
ERR state and the request gets transparently forwarded to the next
available worker. My stateless servlets expect and are prepared for
this!

THE CONFIGURATION PROBLEM:
Should the ts_core_virtual_repository application fail by throwing a
ServletException, the loadbalancer also treats the pum application
as having failed and starts to forward its requests to other workers.
I would like the loadbalancer to treat the applications individually
for 500 Internal Server Error failover purposes. What would be the
best way to do this?

Although we are not short of machine resources, the solution should
not be unnecessarily wasteful and silly - for example, I would NOT
like to create a set of totally new, separate Tomcat server instances
for different applications. Who knows, there might be a third or
fourth web application in the future, so the solution should be
somewhat scalable and maintainable.

MY CURRENT CONFIGURATION:

httpd.conf:
LoadModule jk_module modules/mod_jk-1.2.28-httpd-2.2.3.so
JkWorkersFile conf/ts_tomcat-workers.properties
JkLogFile logs/mod_jk.log
JkLogLevel info
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
JkMount /ts_core_virtual_repository/* loadbalancer
JkMount /jkstatus/* jkstatus
JkMount /pum/* loadbalancer

ts_tomcat-workers.properties:
worker.list=loadbalancer,jkstatus
worker.template.type=ajp13
worker.template.host=localhost
worker.template.port=8110
worker.template.lbfactor=1
worker.template.connection_pool_timeout=600
worker.template.socket_keepalive=true
worker.template.socket_timeout=10
worker.template.ping_mode=A
worker.template.ping_timeout=4000
worker.template.fail_on_status=500
worker.worker1.reference=worker.template
worker.worker1.port=8110
worker.worker2.reference=worker.template
worker.worker2.port=8111
worker.jkstatus.type=status
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=worker1,worker2
worker.loadbalancer.sticky_session=true
worker.loadbalancer.sticky_session_force=false
worker.loadbalancer.recover_time=60
worker.loadbalancer.error_escalation_time=0
