[web2py:10748] Re: URL validation problem (RFC?)

Timothy Farrell Tue, 28 Oct 2008 13:20:11 -0700

Here's the deal.  Currently Apache does serve these files directly, but 
that is part of the problem (security).
We have some old PDF code that generates reports and slaps them in a 
network folder.  Due to several design flaws, this old PDF-generation 
system must go the way of the Dodo.  Before I replace it wholesale, I'd 
like to put web2py in the middle of it.  My web2py app works like this:
1) Receive request for a particular file
2) Generate equivalent PDF with new generation system (Pisa/ReportLab)
3) Compare page lengths and file sizes
4) If page lengths are equal and sizes are similar, serve the new one; 
otherwise serve the old one.
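
Steps 3 and 4 could be sketched roughly as below. This is only an illustration under assumptions: the function names, the 25% size tolerance, and passing page counts and byte sizes in as plain integers are all hypothetical and not taken from the actual app.

```python
def sizes_similar(old_size, new_size, tolerance=0.25):
    """Return True when new_size is within `tolerance` (a fraction) of old_size.

    The 25% default is an assumed value chosen for illustration only.
    """
    if old_size <= 0:
        # An empty or missing old file gives nothing to compare against.
        return False
    return abs(new_size - old_size) <= tolerance * old_size


def pick_pdf(old_pages, new_pages, old_size, new_size, tolerance=0.25):
    """Decide which PDF to serve: "new" or "old".

    Page counts and byte sizes are supplied by the caller, so this sketch
    does not depend on any particular PDF library.
    """
    if old_pages == new_pages and sizes_similar(old_size, new_size, tolerance):
        return "new"
    return "old"
```

Keeping the decision separate from the file handling means the threshold can be tuned without touching the code that generates or reads the PDFs.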


This is an interim step to completely replacing the old system.  The 
problem is that I can't access the piece of code that generates the 
hyperlinks.  So I have to make web2py morph to accept the old style.
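
To make the "morph" idea concrete, here is a minimal sketch of one way such lenient matching might look: strict on application/controller/function, lenient on the trailing args, and still refusing consecutive dots. The pattern and the parse_path helper are hypothetical illustrations, not web2py's actual regex from main.py.

```python
import re

# Hypothetical patterns for illustration; not web2py's real validation regex.
# Strict set for application/controller/function names.
STRICT = r"[A-Za-z0-9_]+"
# Lenient set for args: alphanumerics plus the RFC 1738 section 2.2
# characters $ - _ . + ! * ' ( ) ,
LENIENT = r"[A-Za-z0-9$\-_.+!*'(),]+"

URL_RE = re.compile(
    r"^/(?P<app>%s)/(?P<controller>%s)/(?P<function>%s)(?P<args>(?:/%s)*)/?$"
    % (STRICT, STRICT, STRICT, LENIENT)
)


def parse_path(path):
    """Return (app, controller, function, args), or None if the path is rejected."""
    match = URL_RE.match(path)
    if match is None:
        return None
    args = [a for a in match.group("args").split("/") if a]
    # Guard against directory traversal: no arg may contain consecutive dots.
    if any(".." in a for a in args):
        return None
    return (
        match.group("app"),
        match.group("controller"),
        match.group("function"),
        args,
    )
```

With this sketch, a path such as /init/default/pdfpass/dir/PDF'FILENAME.pdf is accepted, while anything containing ".." or a reserved character like ";" in the args is still rejected.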

-tim

mdipierro wrote:
> Is this because of static files for a specific app?
>
> why not have apache serve them directly?
>
> I cannot imagine any other case when this is relevant. Can you give us
> an example?
>
> Massimo
>
>
> On Oct 28, 2:40 pm, Timothy Farrell <[EMAIL PROTECTED]> wrote:
>   
>> I understand your position.  Under normal circumstances, I would agree
>> with you.  But, I just have a situation where I can't control exactly
>> what's coming in and so I need web2py to be more lenient.  I'll (have to)
>> run a custom version of web2py until I no longer need to interface with
>> this older system (which is likely to be about a year).
>>
>> -tim
>>
>> mdipierro wrote:
>>     
>>> I disagree. The web2py url is only used inside web2py and I think
>>> web2py should enforce good practice even if it is more strict than
>>> actual specs. We can disagree on what is good practice. For me it is when
>>> the url only includes alphanumeric characters, _ , /, and non
>>> consecutive dots. This avoids potential trouble with, for example,
>>> directory traversal attacks in downloading files.
>>>       
>>> Massimo
>>>       
>>> On Oct 28, 2:13 pm, Timothy Farrell <[EMAIL PROTECTED]> wrote:
>>>       
>>>> Thanks Kyle.
>>>>         
>>>> What I have to say below may be heresy...
>>>>         
>>>> In light of the silence on this subject, I've decided that web2py's URL
>>>> validation (for the purposes of mapping URLs to
>>>> applications/controllers/functions) oversteps its bounds and
>>>> over-zealously restricts (at least for my own purposes).  I've come to
>>>> the opinion that web2py should only validate the portions of the URL
>>>> that it needs to parse in order to run the appropriate function and pass
>>>> the appropriate args.  All other input sanitization should be left to
>>>> the relevant application functions.
>>>>         
>>>> Regarding RFC1738, as I mentioned below, this is meaningless because the
>>>> wsgiserver already unquotes the path before it passes it on to web2py.
>>>>         
>>>> In the practical sense, this means that web2py should only validate the
>>>> first three elements of the path and leave the rest to the application.
>>>> This also leaves an implementation problem with regular expressions, but
>>>> that's another story.
>>>>         
>>>> Opinions? Thoughts? Tomatoes?
>>>>         
>>>> Kyle Smith wrote:
>>>>         
>>>>> You are absolutely correct that it's not the same discussion. I was
>>>>> just trying to point you to previous conversation about url validation
>>>>> in general since it is a similar topic.
>>>>>           
>>>>> Kyle
>>>>>           
>>>>> On Wed, Oct 22, 2008 at 1:50 PM, Timothy Farrell <[EMAIL PROTECTED]
>>>>> <mailto:[EMAIL PROTECTED]>> wrote:
>>>>>           
>>>>>     Thanks for your input, but this is not about the IS_URL
>>>>>     validator.  This is about web2py utterly rejecting any request
>>>>>     that has an apostrophe (or other RFC-valid punctuation) in the
>>>>>     middle of the path.
>>>>>           
>>>>>     -tim
>>>>>           
>>>>>     Kyle Smith wrote:
>>>>>           
>>>>>>     A similar discussion happened shortly after I started using
>>>>>>     web2py. If you read through this thread you can see the
>>>>>>     discussion that Massimo and I had on the topic. You probably want
>>>>>>     to jump down to around message 13 in the thread.
>>>>>>             
>>>>>>    
>>>>>> http://groups.google.com/group/web2py/browse_frm/thread/414723e11c9f9...
>>>>>>             
>>>>>>     I currently use my own validator (also not completely RFC1738
>>>>>>     compliant) for parsing urls instead of the built in IS_URL.
>>>>>>             
>>>>>>     Kyle
>>>>>>             
>>>>>>     On Wed, Oct 22, 2008 at 1:21 PM, Timothy Farrell
>>>>>>     <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>> wrote:
>>>>>>             
>>>>>>         Ugh, I have an issue.
>>>>>>             
>>>>>>         It has come to my attention that the URL validation does not
>>>>>>         conform to RFC1738 (section 2.2 is the most relevant).  This
>>>>>>         is fine for the scheme://host/application/controller/function
>>>>>>         part of the URL, but it causes problems in such circumstances
>>>>>>         that I ran into today.  Here are the details:
>>>>>>             
>>>>>>         I made a PDF file pass-through that I access like:
>>>>>>         /init/default/pdfpass/dir/PDF_FILENAME.pdf
>>>>>>             
>>>>>>         The problem is that sometimes a request comes in that
>>>>>>         looks like: /init/default/pdfpass/dir/PDF'FILENAME.pdf
>>>>>>         (notice the apostrophe)
>>>>>>             
>>>>>>         This doesn't play well with the URL validation regexp from
>>>>>>         main.py line 39.  I would like to be able to use normal URL
>>>>>>         characters in my function arguments.
>>>>>>             
>>>>>>         For those with not enough time/patience to read an RFC,
>>>>>>         normal path characters are: letters, numbers, and $ - _ . +
>>>>>>         ! * ' ( ) ,  This does not include the special URL path
>>>>>>         characters: / @ ? : = & ;
>>>>>>             
>>>>>>         Thoughts?  Can we include these characters without
>>>>>>         compromising security?
>>>>>>             
>>>>         
>>


