Re: URL-encoding and "#"

2017-10-15 Thread Alex O'Ree
What was unexpected for me was that even if the symbol is URL-encoded,
it was still stripped out by Tomcat. I understand not allowing a raw
backslash in a URL; however, if it is URL-encoded as %5C, then why not
allow it? Maybe I'm missing something.
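
For reference, a minimal sketch of what "URL-encoded" means for the two characters mentioned here. It uses Python's standard urllib.parse and says nothing about how Tomcat treats the result. The sample string is made up. The last line shows one possible reason a plus can seem to vanish: form-style decoding turns a literal "+" into a space.

```python
from urllib.parse import quote, unquote, unquote_plus

# Hypothetical sample value containing a backslash and a plus sign.
raw = "a\\b+c"

# safe="" asks quote() to escape every reserved character, including "/".
encoded = quote(raw, safe="")
print(encoded)            # a%5Cb%2Bc

# Plain percent-decoding restores the original text.
decoded = unquote(encoded)
print(decoded)            # a\b+c

# Form-style decoding treats an unescaped "+" as a space.
form_decoded = unquote_plus("a+b")
print(repr(form_decoded))  # 'a b'
```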

On Fri, Oct 13, 2017 at 7:17 AM, i...@flyingfischer.ch
 wrote:
> On 13.10.2017 at 12:48, Alex O'Ree wrote:
>> Well, that explains a lot. Similar issue for me. With URL encoding, Tomcat
>> is dropping the backslash and the plus symbol.
>
> While I think it is perfectly legitimate to strive for the closest possible
> alignment with standards and specs, I think Tomcat should allow a
> reasonable set of characters to be optionally permitted (as they already
> are in Tomcat up to 8.5).
>
> I am aware that these options may be a security issue and that the
> documentation should state that clearly. However, it is not always
> possible to correct the environment to be "standard" compatible, and the
> educational approach of not allowing these options is understandable but
> may not be appropriate in many situations.
>
> Best regards
> Markus
>
> -
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>




Re: URL-encoding and "#"

2017-10-13 Thread Mark Thomas
On 13/10/17 18:42, André Warnier (tomcat) wrote:
> On 13.10.2017 19:29, Mark Thomas wrote:
>> On 13/10/2017 18:15, André Warnier (tomcat) wrote:
>>> On 13.10.2017 18:17, Mark Thomas wrote:
>>>> On 13/10/2017 17:09, James H. H. Lampert wrote:
>>>>> Thanks to all of you who responded.
>>>>>
>>>>> I found a web page that explains it in ways that I can wrap my
>>>>> 55-year-old brain around, and has an easy-to-read reference chart.
>>>>>
>>>>> https://perishablepress.com/stop-using-unsafe-characters-in-urls/
>>>>>
>>>>> Question: the problem first showed up on a web service that takes a
>>>>> "bodyless" POST operation, and I assume it also applies to GET
>>>>> operations, and to the URL portion of a POST with a body.
>>>>>
>>>>> But what about the body of a POST?
>>>>
>>>> From an HTTP specification point of view, anything goes.
>>>
>>> With respect, I believe that "anything goes" is a bit imprecise here.
>>
>> Nope.
>>
>> You can POST anything. You are talking specifically about form data.
> 
> Mmm. You are being a bit casuistic here. (Granted, not that I wasn't.)

Yeah, sorry about that. I tend to read "With respect..." as meaning
pretty much exactly the opposite.

> In the real world, I would expect that 99% of what is ever POSTed, /is/
> form data.
> Not you ?

For Tomcat I don't have a clue what the split is, but my guess is that
it is a lot less than 99% these days.

>> In
>> that case, as I said, the body has to conform to what the component
>> processing it expects.
> 
> And that component would be .. ?

https://svn.apache.org/viewvc/tomcat/trunk/java/org/apache/tomcat/util/http/Parameters.java?view=annotate

> I don't really know, but I would guess that in most webservers, the
> component parsing the body of a POST with Content-type =
> application/x-www-form-urlencoded, may be the same as the one which is
> parsing the query-string of a URI, no ?
> Considering the similarity of these two things, it would seem that the
> temptation would be hard to resist.

Tomcat uses exactly the same code - with a little wrapping to get the
data into the same format before it starts.

Mark
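
To illustrate the idea of one parser serving both inputs, here is a sketch using Python's standard library. It is not Tomcat's Parameters.java; the helper name parse_params and the example URL are invented for illustration. The point is only that a query string and an application/x-www-form-urlencoded body share the same key=value&key=value shape, so a single routine can decode both.

```python
from urllib.parse import parse_qs, urlsplit

def parse_params(encoded: str) -> dict:
    """Decode a string of percent-encoded key=value pairs joined by '&'."""
    return parse_qs(encoded, keep_blank_values=True)

# Hypothetical request: the same parameters sent once in the URL
# and once as a form-urlencoded POST body.
url = "http://example.invalid/svc?tag=%23urgent&path=a%5Cb"
body = b"tag=%23urgent&path=a%5Cb"

from_query = parse_params(urlsplit(url).query)
# The "little wrapping": turn the body bytes into text first.
from_body = parse_params(body.decode("ascii"))

print(from_query)  # {'tag': ['#urgent'], 'path': ['a\\b']}
print(from_body == from_query)  # True
```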


>> And yes, unicode in form data is 'interesting'...
>>
>> Mark
>>
>>
>>> See e.g. https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4
>>>
>>> There are 2 ways for a user agent to send the content of a HTTP POST :
>>> 1) with Content-type header = application/x-www-form-urlencoded
>>> or
>>> 2) with Content-type header = multipart/form-data
>>>
>>> and while it is true that in the case (2), any submitted key=value pair
>>> would be sent separately 'as is', this would not necessarily be so in
>>> case (1), because then all key=value pairs would be concatenated into
>>> one long string, in which the different key=value pairs would be
>>> separated by (unescaped) "&" signs.
>>> (Apart from other required encodings, see the page above)
>>> So if the client is not a browser, and "composes" itself the POST body
>>> before sending it, and sends it with a Content-type (1), it had better
>>> encode the individual parameter pairs as described, before concatenating
>>> them, because that is what the server would expect.
>>>
>>> As an additional note, if it so happened that the data in the client
>>> could contain Unicode text, do not forget that this is (still) not the
>>> standard in HTTP (and URI's, and thus query-string-like things), and
>>> make sure that you use the proper method to encode any printable
>>> characters which are not purely US-ASCII.  Again, browsers generally do
>>> this correctly, but custom clients not necessarily. (And a "custom
>>> client" in this case, could even be a bit of javascript which is
>>> embedded in one of your own pages, but does its own calls to the server
>>> on the side).
>>>
>>> I just recently got bitten by this, even in a quite recent browser,
>>> where some javascript function was composing a POST to a server (using
>>> type (1) above), and was NOT doing it correctly, even though the page
>>> containing and calling this function was itself declared as
>>> Unicode/UTF-8.
>>> (that was with (and I am too sorely tempted to add "of course" to resist
>>> it) some revision of IE-11 - although other revisions of the same
>>> browser did not exhibit that same issue).
>>>
>>> [...]

Re: URL-encoding and "#"

2017-10-13 Thread James H. H. Lampert

On 10/13/17, 10:50 AM, Igal @ Lucee.org wrote:

On 10/13/2017 10:42 AM, André Warnier (tomcat) wrote:

Mmm. You are being a bit casuistic here. (Granted, not that I wasn't.)
In the real world, I would expect that 99% of what is ever POSTed,
/is/ form data.
Not you ?


10 years ago I would have agreed, but with REST services there are many
APIs that expect POSTed data that does not originate in web forms.


Exactly. And this whole discussion has been about RESTful web services 
(specifically, RESTful web services implemented with Swagger), so form 
data isn't even a consideration (and I'm pretty sure I plugged any 
Swagger-specific or JSON-specific holes in the body encoding months ago).
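
As a rough illustration of why form encoding does not come into play for a JSON body: the sketch below serializes a made-up payload with Python's json module. The field names and values are hypothetical. Characters such as "#" and "+" travel as-is, and the only escaping applied is JSON's own (for example the backslash).

```python
import json

# Hypothetical payload with characters that would need percent-encoding
# in a URL but not in a JSON request body.
payload = {"tag": "#urgent", "path": "a\\b", "op": "1+1"}

body = json.dumps(payload).encode("utf-8")
headers = {"Content-Type": "application/json; charset=utf-8"}

print(body.decode("utf-8"))
# The receiving side recovers the same values with a JSON parser.
round_trip = json.loads(body)
print(round_trip == payload)  # True
```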






Re: URL-encoding and "#"

2017-10-13 Thread Igal @ Lucee.org

On 10/13/2017 10:42 AM, André Warnier (tomcat) wrote:

Mmm. You are being a bit casuistic here. (Granted, not that I wasn't.)
In the real world, I would expect that 99% of what is ever POSTed, 
/is/ form data.

Not you ?


10 years ago I would have agreed, but with REST services there are many 
APIs that expect POSTed data that does not originate in web forms.


Respectfully,

Igal Sapir
Lucee Core Developer
Lucee.org 



Re: URL-encoding and "#"

2017-10-13 Thread tomcat

On 13.10.2017 19:29, Mark Thomas wrote:

On 13/10/2017 18:15, André Warnier (tomcat) wrote:

On 13.10.2017 18:17, Mark Thomas wrote:

On 13/10/2017 17:09, James H. H. Lampert wrote:

Thanks to all of you who responded.

I found a web page that explains it in ways that I can wrap my
55-year-old brain around, and has an easy-to-read reference chart.

https://perishablepress.com/stop-using-unsafe-characters-in-urls/

Question: the problem first showed up on a web service that takes a
"bodyless" POST operation, and I assume it also applies to GET
operations, and to the URL portion of a POST with a body.

But what about the body of a POST?


  From an HTTP specification point of view, anything goes.


With respect, I believe that "anything goes" is a bit imprecise here.


Nope.

You can POST anything. You are talking specifically about form data.


Mmm. You are being a bit casuistic here. (Granted, not that I wasn't.)
In the real world, I would expect that 99% of what is ever POSTed, /is/ form 
data.
Not you ?

 In

that case, as I said, the body has to conform to what the component
processing it expects.


And that component would be .. ?
I don't really know, but I would guess that in most webservers, the component parsing the 
body of a POST with Content-type = application/x-www-form-urlencoded, may be the same as 
the one which is parsing the query-string of a URI, no ?
Considering the similarity of these two things, it would seem that the temptation would be 
hard to resist.




And yes, unicode in form data is 'interesting'...

Mark



See e.g. https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4

There are 2 ways for a user agent to send the content of a HTTP POST :
1) with Content-type header = application/x-www-form-urlencoded
or
2) with Content-type header = multipart/form-data

and while it is true that in the case (2), any submitted key=value pair
would be sent separately 'as is', this would not necessarily be so in
case (1), because then all key=value pairs would be concatenated into
one long string, in which the different key=value pairs would be
separated by (unescaped) "&" signs.
(Apart from other required encodings, see the page above)
So if the client is not a browser, and "composes" itself the POST body
before sending it, and sends it with a Content-type (1), it had better
encode the individual parameter pairs as described, before concatenating
them, because that is what the server would expect.

As an additional note, if it so happened that the data in the client
could contain Unicode text, do not forget that this is (still) not the
standard in HTTP (and URI's, and thus query-string-like things), and
make sure that you use the proper method to encode any printable
characters which are not purely US-ASCII.  Again, browsers generally do
this correctly, but custom clients not necessarily. (And a "custom
client" in this case, could even be a bit of javascript which is
embedded in one of your own pages, but does its own calls to the server
on the side).

I just recently got bitten by this, even in a quite recent browser,
where some javascript function was composing a POST to a server (using
type (1) above), and was NOT doing it correctly, even though the page
containing and calling this function was itself declared as Unicode/UTF-8.
(that was with (and I am too sorely tempted to add "of course" to resist
it) some revision of IE-11 - although other revisions of the same
browser did not exhibit that same issue).

[...]





Re: URL-encoding and "#"

2017-10-13 Thread Mark Thomas
On 13/10/2017 18:15, André Warnier (tomcat) wrote:
> On 13.10.2017 18:17, Mark Thomas wrote:
>> On 13/10/2017 17:09, James H. H. Lampert wrote:
>>> Thanks to all of you who responded.
>>>
>>> I found a web page that explains it in ways that I can wrap my
>>> 55-year-old brain around, and has an easy-to-read reference chart.
>>>
>>> https://perishablepress.com/stop-using-unsafe-characters-in-urls/
>>>
>>> Question: the problem first showed up on a web service that takes a
>>> "bodyless" POST operation, and I assume it also applies to GET
>>> operations, and to the URL portion of a POST with a body.
>>>
>>> But what about the body of a POST?
>>
>>  From an HTTP specification point of view, anything goes.
> 
> With respect, I believe that "anything goes" is a bit imprecise here.

Nope.

You can POST anything. You are talking specifically about form data. In
that case, as I said, the body has to conform to what the component
processing it expects.

And yes, unicode in form data is 'interesting'...

Mark


> See e.g. https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4
> 
> There are 2 ways for a user agent to send the content of a HTTP POST :
> 1) with Content-type header = application/x-www-form-urlencoded
> or
> 2) with Content-type header = multipart/form-data
> 
> and while it is true that in the case (2), any submitted key=value pair
> would be sent separately 'as is', this would not necessarily be so in
> case (1), because then all key=value pairs would be concatenated into
> one long string, in which the different key=value pairs would be
> separated by (unescaped) "&" signs.
> (Apart from other required encodings, see the page above)
> So if the client is not a browser, and "composes" itself the POST body
> before sending it, and sends it with a Content-type (1), it had better
> encode the individual parameter pairs as described, before concatenating
> them, because that is what the server would expect.
> 
> As an additional note, if it so happened that the data in the client
> could contain Unicode text, do not forget that this is (still) not the
> standard in HTTP (and URI's, and thus query-string-like things), and
> make sure that you use the proper method to encode any printable
> characters which are not purely US-ASCII.  Again, browsers generally do
> this correctly, but custom clients not necessarily. (And a "custom
> client" in this case, could even be a bit of javascript which is
> embedded in one of your own pages, but does its own calls to the server
> on the side).
> 
> I just recently got bitten by this, even in a quite recent browser,
> where some javascript function was composing a POST to a server (using
> type (1) above), and was NOT doing it correctly, even though the page
> containing and calling this function was itself declared as Unicode/UTF-8.
> (that was with (and I am too sorely tempted to add "of course" to resist
> it) some revision of IE-11 - although other revisions of the same
> browser did not exhibit that same issue).
> 
> [...]
> 
> 



Re: URL-encoding and "#"

2017-10-13 Thread tomcat

On 13.10.2017 18:17, Mark Thomas wrote:

On 13/10/2017 17:09, James H. H. Lampert wrote:

Thanks to all of you who responded.

I found a web page that explains it in ways that I can wrap my
55-year-old brain around, and has an easy-to-read reference chart.

https://perishablepress.com/stop-using-unsafe-characters-in-urls/

Question: the problem first showed up on a web service that takes a
"bodyless" POST operation, and I assume it also applies to GET
operations, and to the URL portion of a POST with a body.

But what about the body of a POST?


 From an HTTP specification point of view, anything goes.


With respect, I believe that "anything goes" is a bit imprecise here.

See e.g. https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4

There are 2 ways for a user agent to send the content of a HTTP POST :
1) with Content-type header = application/x-www-form-urlencoded
or
2) with Content-type header = multipart/form-data

and while it is true that in the case (2), any submitted key=value pair would be sent 
separately 'as is', this would not necessarily be so in case (1), because then all 
key=value pairs would be concatenated into one long string, in which the different 
key=value pairs would be separated by (unescaped) "&" signs.

(Apart from other required encodings, see the page above)
So if the client is not a browser, and "composes" itself the POST body before sending it, 
and sends it with a Content-type (1), it had better encode the individual parameter pairs 
as described, before concatenating them, because that is what the server would expect.
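
The point about encoding each pair before concatenating can be sketched as follows, using Python's urllib.parse as a stand-in for a custom client. The field names and values are invented. The naive join is included only to show how an unescaped "&" or "+" changes what the server-side parser sees.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical form fields containing "&", "=", "+" and "#".
fields = [("name", "Tom & Jerry"), ("expr", "a=b+c"), ("tag", "#1")]

# Encode every key and value, then join the pairs with "&".
body = urlencode(fields)
print(body)  # name=Tom+%26+Jerry&expr=a%3Db%2Bc&tag=%231

# A form parser recovers exactly what was sent.
parsed = parse_qsl(body)
print(parsed == fields)  # True

# Joining without encoding first yields a body that parses differently.
naive = "&".join(f"{key}={value}" for key, value in fields)
naive_parsed = parse_qsl(naive)
print(naive_parsed == fields)  # False
```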


As an additional note, if it so happened that the data in the client could contain Unicode 
text, do not forget that this is (still) not the standard in HTTP (and URI's, and thus 
query-string-like things), and make sure that you use the proper method to encode any 
printable characters which are not purely US-ASCII.  Again, browsers generally do this 
correctly, but custom clients not necessarily. (And a "custom client" in this case, could 
even be a bit of javascript which is embedded in one of your own pages, but does its own 
calls to the server on the side).
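
A small sketch of the non-ASCII case, again in Python and with a made-up sample word. It assumes both sides have agreed on UTF-8; that agreement is an assumption of this example, not something the protocol guarantees. The last lines show the typical symptom when the receiver decodes with a different charset.

```python
from urllib.parse import quote, unquote

text = "café"  # hypothetical value containing one non-ASCII character

# Encode the character as UTF-8 bytes, then percent-escape each byte.
encoded = quote(text, safe="", encoding="utf-8")
print(encoded)  # caf%C3%A9

# Decoding with the same charset restores the text.
decoded = unquote(encoded, encoding="utf-8")
print(decoded == text)  # True

# Decoding with a different charset produces mojibake.
wrong = unquote(encoded, encoding="latin-1")
print(wrong)  # cafÃ©
```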


I just recently got bitten by this, even in a quite recent browser, where some javascript 
function was composing a POST to a server (using type (1) above), and was NOT doing it 
correctly, even though the page containing and calling this function was itself declared 
as Unicode/UTF-8.
(that was with (and I am too sorely tempted to add "of course" to resist it) some revision 
of IE-11 - although other revisions of the same browser did not exhibit that same issue).


[...]





Re: URL-encoding and "#"

2017-10-13 Thread Mark Thomas
On 13/10/2017 17:09, James H. H. Lampert wrote:
> Thanks to all of you who responded.
> 
> I found a web page that explains it in ways that I can wrap my
> 55-year-old brain around, and has an easy-to-read reference chart.
> 
> https://perishablepress.com/stop-using-unsafe-characters-in-urls/
> 
> Question: the problem first showed up on a web service that takes a
> "bodyless" POST operation, and I assume it also applies to GET
> operations, and to the URL portion of a POST with a body.
> 
> But what about the body of a POST?

From an HTTP specification point of view, anything goes. You are only
limited by whatever rules are imposed by the component that processes
that data.

Mark





Re: URL-encoding and "#"

2017-10-13 Thread James H. H. Lampert

Thanks to all of you who responded.

I found a web page that explains it in ways that I can wrap my 
55-year-old brain around, and has an easy-to-read reference chart.


https://perishablepress.com/stop-using-unsafe-characters-in-urls/

Question: the problem first showed up on a web service that takes a 
"bodyless" POST operation, and I assume it also applies to GET 
operations, and to the URL portion of a POST with a body.


But what about the body of a POST?

--
JHHL




Re: URL-encoding and "#"

2017-10-13 Thread i...@flyingfischer.ch
On 13.10.2017 at 12:48, Alex O'Ree wrote:
> Well, that explains a lot. Similar issue for me. With URL encoding, Tomcat
> is dropping the backslash and the plus symbol.

While I think it is perfectly legitimate to strive for the closest possible
alignment with standards and specs, I think Tomcat should allow a
reasonable set of characters to be optionally permitted (as they already
are in Tomcat up to 8.5).

I am aware that these options may be a security issue and that the
documentation should state that clearly. However, it is not always
possible to correct the environment to be "standard" compatible, and the
educational approach of not allowing these options is understandable but
may not be appropriate in many situations.

Best regards
Markus




Re: URL-encoding and "#"

2017-10-13 Thread Alex O'Ree
Well, that explains a lot. Similar issue for me. With URL encoding, Tomcat
is dropping the backslash and the plus symbol.

On Oct 13, 2017 3:01 AM, "Mark Thomas"  wrote:

> On 13/10/2017 07:38, Peter Kreuser wrote:
> > Chris,
> >
> >
> >
> >
> > Peter Kreuser
> >> On 13.10.2017 at 04:29, Christopher Schultz <
> w...@christopherschultz.net> wrote:
> >>
> > James,
> >
>  On 10/12/17 8:44 PM, James H. H. Lampert wrote:
>  Question:
> 
>  The application we're developing has a suite of web services
>  (RESTful, Swagger-based), and at least one of them can accept a
>  pound sign ("#") as a URL parameter.
> 
>  Several months ago, with the application and all of its services
>  running on Tomcat 7, it was accepting a plain, naked # in the URL.
>  Now, running on Tomcat 8.5, it's returning an error message
>  ("HTTP/1.1 400").
> >
> > No client should ever send a naked # to a server. It's a violation of
> > the spec, full stop. That isn't to say that Tomcat should fail in any
> > particular way, but Tomcat is well within its rights to say "a # is
> > not allowed in a URL, so this is a bad request".
> >
> >
> >> Nevertheless there is AFAIR a commandline switch to set TC 8.5 to the
> old behavior.
>
> From memory, # isn't one of the allowed exceptions.
>
> The full list of invalid characters in the request line that Tomcat
> started to check for is:
> ' ', '\"', '#', '<', '>', '\\', '^', '`', '{', '|', '}'
>
> The allowed exceptions are (currently) '{', '|', '}'
>
> Mark
>
> >> James, please browse the mail archives.
> >> From a quick look this seems to help, for a short term solution:
> >
> >> https://marc.info/?l=tomcat-user=150183715500537=2
> >
> >> Please nevertheless fix the client, for a better world as Chris pointed
> out ;-P.
> >
> >> Best regards
> >
> >> Peter
> >
>  The developer (in a different time zone) has explained about
>  URL-encoding, but hasn't said whether there was anything in his
>  code to make it stop tolerating the naked # sign.
> 
>  Did the change from Tomcat 7 to Tomcat 8.5 have anything to do
>  with this?
> >
> > Each version of Tomcat gets more and more strict about the garbage it
> > will accept from clients. This is done to improve the world as a
> > whole, and also improve security when it comes to things like
> > converting URL paths into filesystem paths, etc. Strictly speaking,
> > everything should *always* be safe, but it helps to stop The Badness
> > at the earliest opportunity.
> >
>  And if so, are there any other common ASCII characters that used
>  to be accepted as characters, but now have to be URL-encoded?
> > Anything in the URL spec that is allowed should be allowed. Clients
> > should expect that anything not mentioned in the spec would be
> > rejected by a compliant server.
> >
> > -chris


Re: URL-encoding and "#"

2017-10-13 Thread i...@flyingfischer.ch
On 13.10.2017 at 09:01, Mark Thomas wrote:
> From memory, # isn't one of the allowed exceptions.
>
> The full list of invalid characters in the request line that Tomcat
> started to check for is:
> ' ', '\"', '#', '<', '>', '\\', '^', '`', '{', '|', '}'
>
> The allowed exceptions are (currently) '{', '|', '}'
>
> Mark
By the way:

While fully agreeing that the '#' character should not be sent by a
client to the server, it would still be desirable for those three
characters, which can currently be allowed through optional configuration,
to also be a configurable exception in Tomcat 9.

As far as I know, this option is available only up to and including
Tomcat 8.5.

While it is a good thing to save the world, real world scenarios may differ.

Thanks for considering.

Markus




Re: URL-encoding and "#"

2017-10-13 Thread Mark Thomas
On 13/10/2017 07:38, Peter Kreuser wrote:
> Chris,
> 
> 
> 
> 
> Peter Kreuser
>> On 13.10.2017 at 04:29, Christopher Schultz wrote:
>>
> James,
> 
 On 10/12/17 8:44 PM, James H. H. Lampert wrote:
 Question:

 The application we're developing has a suite of web services
 (RESTful, Swagger-based), and at least one of them can accept a
 pound sign ("#") as a URL parameter.

 Several months ago, with the application and all of its services
 running on Tomcat 7, it was accepting a plain, naked # in the URL.
 Now, running on Tomcat 8.5, it's returning an error message
 ("HTTP/1.1 400").
> 
> No client should ever send a naked # to a server. It's a violation of
> the spec, full stop. That isn't to say that Tomcat should fail in any
> particular way, but Tomcat is well within its rights to say "a # is
> not allowed in a URL, so this is a bad request".
> 
> 
>> Nevertheless there is AFAIR a commandline switch to set TC 8.5 to the old 
>> behavior.

From memory, # isn't one of the allowed exceptions.

The full list of invalid characters in the request line that Tomcat
started to check for is:
' ', '\"', '#', '<', '>', '\\', '^', '`', '{', '|', '}'

The allowed exceptions are (currently) '{', '|', '}'
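
The rule described above can be sketched as a simple check. This is only an illustration built from the character list in this message; it is not Tomcat's code, and the names INVALID_RAW, RELAXABLE, rejected_chars and the relaxed parameter are all hypothetical.

```python
# Characters listed above as invalid when sent raw in the request line.
INVALID_RAW = frozenset(' "#<>\\^`{|}')
# Characters named above as the (then current) allowed exceptions.
RELAXABLE = frozenset("{|}")

def rejected_chars(request_target, relaxed=frozenset()):
    """Return the raw characters that would make the request target invalid."""
    relaxed = frozenset(relaxed)
    if not relaxed <= RELAXABLE:
        raise ValueError("only '{', '|' and '}' can be relaxed in this sketch")
    return [ch for ch in request_target
            if ch in INVALID_RAW and ch not in relaxed]

print(rejected_chars("/svc/items?tag=#1"))        # ['#']
print(rejected_chars("/svc/items?tag=%231"))      # []
print(rejected_chars("/svc/{id}"))                # ['{', '}']
print(rejected_chars("/svc/{id}", relaxed="{}"))  # []
```

A non-empty result corresponds to the situation where the server would be entitled to answer with a 400.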

Mark

>> James, please browse the mail archives.
>> From a quick look this seems to help, for a short term solution:
> 
>> https://marc.info/?l=tomcat-user=150183715500537=2
> 
>> Please nevertheless fix the client, for a better world as Chris pointed out 
>> ;-P.
> 
>> Best regards
> 
>> Peter
> 
 The developer (in a different time zone) has explained about 
 URL-encoding, but hasn't said whether there was anything in his
 code to make it stop tolerating the naked # sign.

 Did the change from Tomcat 7 to Tomcat 8.5 have anything to do
 with this?
> 
> Each version of Tomcat gets more and more strict about the garbage it
> will accept from clients. This is done to improve the world as a
> whole, and also improve security when it comes to things like
> converting URL paths into filesystem paths, etc. Strictly speaking,
> everything should *always* be safe, but it helps to stop The Badness
> at the earliest opportunity.
> 
 And if so, are there any other common ASCII characters that used
 to be accepted as characters, but now have to be URL-encoded?
> Anything in the URL spec that is allowed should be allowed. Clients
> should expect that anything not mentioned in the spec would be
> rejected by a compliant server.
> 
> -chris



Re: URL-encoding and "#"

2017-10-13 Thread Peter Kreuser
Chris,




Peter Kreuser
> On 13.10.2017 at 04:29, Christopher Schultz wrote:
> 
> James,
> 
>> On 10/12/17 8:44 PM, James H. H. Lampert wrote:
>> Question:
>> 
>> The application we're developing has a suite of web services
>> (RESTful, Swagger-based), and at least one of them can accept a
>> pound sign ("#") as a URL parameter.
>> 
>> Several months ago, with the application and all of its services
>> running on Tomcat 7, it was accepting a plain, naked # in the URL.
>> Now, running on Tomcat 8.5, it's returning an error message
>> ("HTTP/1.1 400").
> 
> No client should ever send a naked # to a server. It's a violation of
> the spec, full stop. That isn't to say that Tomcat should fail in any
> particular way, but Tomcat is well within its rights to say "a # is
> not allowed in a URL, so this is a bad request".
> 

Nevertheless there is AFAIR a commandline switch to set TC 8.5 to the old 
behavior.

James, please browse the mail archives.
From a quick look this seems to help, for a short term solution:

https://marc.info/?l=tomcat-user=150183715500537=2

Please nevertheless fix the client, for a better world as Chris pointed out ;-P.

Best regards

Peter

>> The developer (in a different time zone) has explained about 
>> URL-encoding, but hasn't said whether there was anything in his
>> code to make it stop tolerating the naked # sign.
>> 
>> Did the change from Tomcat 7 to Tomcat 8.5 have anything to do
>> with this?
> 
> Each version of Tomcat gets more and more strict about the garbage it
> will accept from clients. This is done to improve the world as a
> whole, and also improve security when it comes to things like
> converting URL paths into filesystem paths, etc. Strictly speaking,
> everything should *always* be safe, but it helps to stop The Badness
> at the earliest opportunity.
> 
>> And if so, are there any other common ASCII characters that used
>> to be accepted as characters, but now have to be URL-encoded?
> Anything in the URL spec that is allowed should be allowed. Clients
> should expect that anything not mentioned in the spec would be
> rejected by a compliant server.
> 
> -chris


Re: URL-encoding and "#"

2017-10-12 Thread Christopher Schultz
James,

On 10/12/17 8:44 PM, James H. H. Lampert wrote:
> Question:
> 
> The application we're developing has a suite of web services
> (RESTful, Swagger-based), and at least one of them can accept a
> pound sign ("#") as a URL parameter.
> 
> Several months ago, with the application and all of its services
> running on Tomcat 7, it was accepting a plain, naked # in the URL.
> Now, running on Tomcat 8.5, it's returning an error message
> ("HTTP/1.1 400").

No client should ever send a naked # to a server. It's a violation of
the spec, full stop. That isn't to say that Tomcat should fail in any
particular way, but Tomcat is well within its rights to say "a # is
not allowed in a URL, so this is a bad request".
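
A short sketch of why a naked "#" cannot carry data to the server: in a URL it starts the fragment, so a generic URL parser never treats what follows as part of the query. The example below uses Python's urllib.parse with an invented host and parameter name; it does not model Tomcat's behaviour.

```python
from urllib.parse import urlsplit, quote, parse_qs

# Hypothetical URL with a raw "#" inside what was meant to be a parameter.
naked = "http://example.invalid/items?code=A#1"
parts = urlsplit(naked)
print(parts.query)     # code=A
print(parts.fragment)  # 1

# Percent-encoding the value keeps the "#" inside the query string.
encoded_url = "http://example.invalid/items?code=" + quote("A#1", safe="")
encoded_query = urlsplit(encoded_url).query
print(encoded_query)   # code=A%231

params = parse_qs(encoded_query)
print(params)          # {'code': ['A#1']}
```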

> The developer (in a different time zone) has explained about 
> URL-encoding, but hasn't said whether there was anything in his
> code to make it stop tolerating the naked # sign.
> 
> Did the change from Tomcat 7 to Tomcat 8.5 have anything to do
> with this?

Each version of Tomcat gets more and more strict about the garbage it
will accept from clients. This is done to improve the world as a
whole, and also improve security when it comes to things like
converting URL paths into filesystem paths, etc. Strictly speaking,
everything should *always* be safe, but it helps to stop The Badness
at the earliest opportunity.

> And if so, are there any other common ASCII characters that used
> to be accepted as characters, but now have to be URL-encoded?

Anything the URL spec allows should be accepted. Clients should expect
a compliant server to reject anything the spec does not allow.
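As a rough illustration of staying inside the spec on the client side, the sketch below percent-encodes every byte of a parameter value that is not in the RFC 3986 "unreserved" set (letters, digits, "-", ".", "_", "~"). This is deliberately conservative and is only an assumption about what a client might do; the class and method names are made up for this example and are not part of Tomcat or of the application in question.

```java
import java.nio.charset.StandardCharsets;

final class PercentEncoder {
    // RFC 3986 "unreserved" characters: these never need encoding.
    private static final String UNRESERVED =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Percent-encodes every UTF-8 byte of the value that is not unreserved.
    static String encode(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            if (UNRESERVED.indexOf(c) >= 0) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("A#1"));      // A%231
        System.out.println(encode("a\\b+c"));   // a%5Cb%2Bc
        System.out.println(encode("[x] {y}"));  // %5Bx%5D%20%7By%7D
    }
}
```

Encoding more than strictly necessary is harmless for a single parameter value, since the server decodes it back to the same characters.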

-chris




RE: URL encoding problem IIS6 / JK1.2.25 / Tomcat 5.5.20

2008-06-25 Thread Jesse Klaasse
André Warnier wrote:

> It would appear (from the logs) that there is some double-encoding of the URI
> going on.
> [snip]
> But if, somewhere along the line, a piece of code received the encoded
> URI http://.../test%5Bbrackets%5D.jsp and decided to re-encode it
> using the %-hex-hex method, you would get this URI:
> http://.../test%255Bbrackets%255D.jsp (where %25 is the encoded version of
> %).
> The next step would then decode this URI back into
> http://.../test%5Bbrackets%5D.jsp, and that is what the server would try to
> access, what would be logged, and also what you seem to experience.
>
> So, which is the culprit that re-encodes something it should not?
> And is there not some parameter somewhere which forces it to do so?
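
The double-encoding round trip described above can be reproduced in a few lines of Java. This is only a sketch using the JDK's URLEncoder and URLDecoder (the Charset overloads need Java 10 or later); it is not the code path IIS or the JK connector actually runs, and the file name is the one from this thread.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class DoubleEncodingDemo {
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "test[brackets].jsp";

        String once = encode(name);    // what the client sends
        String twice = encode(once);   // what a re-encoding hop produces
        System.out.println(once);      // test%5Bbrackets%5D.jsp
        System.out.println(twice);     // test%255Bbrackets%255D.jsp

        // A single decode on the receiving side only undoes the second
        // encoding, so the server looks for the still-encoded name.
        System.out.println(decode(twice));          // test%5Bbrackets%5D.jsp
        System.out.println(decode(decode(twice)));  // test[brackets].jsp
    }
}
```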

Thanks for your excellent answer. I have fixed the problem now. There is a 
setting for the isapi_redirector called uri_select. This parameter controls 
the form in which URIs are passed from IIS to Tomcat. It defaults to the value 
proxy, which leads to some re-encoding. I have changed the parameter's value 
to unparsed, which has solved the problem.
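
For reference, the change described above might look like this in the redirector's configuration. This is a sketch that assumes the redirector is configured through an isapi_redirect.properties file rather than the Windows registry; only the uri_select line comes from this thread, and any other settings in the file stay as they are.

```
# isapi_redirect.properties (excerpt)
# Forward the original request URI to Tomcat without re-encoding it.
uri_select=unparsed
```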

For those who want to know more, the parameter is explained here: 
http://tomcat.apache.org/connectors-doc/reference/iis.html

Jesse.

-
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: URL encoding problem IIS6 / JK1.2.25 / Tomcat 5.5.20

2008-06-25 Thread André Warnier


Jesse Klaasse wrote:
[...]

Good.
Now, I should add that using [ and ] in URLs is not really something I 
would recommend, if only for legibility reasons.  It will always make 
people wonder if what they're seeing in the logfile is normal, or if 
it's some programming syntax which escaped there.
And I would bet that naming your servlet that way is also going to bring 
you trouble if you ever have to reference it in a JavaScript line, for 
instance.


André





RE: URL encoding problem IIS6 / JK1.2.25 / Tomcat 5.5.20

2008-06-25 Thread Jesse Klaasse
André Warnier wrote:
[...]
> Now, I should add that using [ and ] in URLs is not really something I would 
> recommend, if only for legibility reasons.  It will always make people wonder 
> if what they're seeing in the logfile is normal, or if it's some programming 
> syntax which escaped there.
> And I would bet that naming your servlet that way is also going to bring you 
> trouble if you ever have to reference it in a JavaScript line, for instance.

I completely agree with you that [ and ] are not recommended in URLs. 
However, I don't really have a say in that; it has something to do with 
politics and legacy stuff. Thanks again for your help!

Jesse.
