Hello,

I'm still not sure why urllib2 was able to retrieve the URLs
containing braces, but I misdiagnosed the problem: the issue lies
with dbpedia, not with Google App Engine.  I thought I'd update this
thread to say so.

If anyone is interested, the root cause seems to be a redirect (303
See Other) where the URL being redirected to is not encoded.  A direct
request to the redirect URL, with encoding, retrieves the intended
document.
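
For anyone who wants to try the same workaround, here is a minimal
sketch of the encoding step.  It is plain Python 3 standard library,
not the code I actually ran on App Engine, and the helper name
encode_redirect_url and the example.org URL are made up for
illustration.  It assumes the redirect target arrives as a raw string
(for instance from a Location header) and that any '%' already in it
starts a valid escape.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_redirect_url(location):
    """Percent-encode unsafe characters in a redirect target.

    Hypothetical helper. '%' is treated as safe so that sequences
    which are already escaped (e.g. %28) are not encoded twice. The
    trade-off is that a literal, unescaped '%' is left untouched.
    """
    parts = urlsplit(location)
    # Keep '/' so path segments survive; '(' and ')' become %28 and %29.
    path = quote(parts.path, safe="/%")
    # Keep the usual query delimiters intact.
    query = quote(parts.query, safe="=&%")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, query, parts.fragment)
    )


if __name__ == "__main__":
    # Made-up redirect target with unencoded braces.
    raw = "http://example.org/data/Companion_(manga).rdf"
    print(encode_redirect_url(raw))
```

The idea would be to fetch with automatic redirect following turned
off, read the redirect target from the 303 response, pass it through a
helper like this, and then request the encoded URL directly.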

Matthew

On May 5, 11:07 pm, Matt Trinneer <[email protected]> wrote:
> Having some luck...  By using urllib2 instead of urlfetch I am able to
> load the same URLs on the production server without any issue.  Not
> really a solution per se, but it gets the job done.  Appreciate
> everyone's feedback.
>
> On May 5, 10:29 pm, Matt Trinneer <[email protected]> wrote:
>
> > Hi George,
>
> > Thanks for the response.  I've done some additional testing and am not
> > getting much further.  Unfortunately in this case I do not have
> > control of the endpoint and am stuck with braces in the URL.
>
> > Some additional notes which may be of use to anyone who happens upon
> > this:
>
> > 1. The URLs being requested in this example return xml/rdf
> > documents.
> > 2. In the case of requesting a resource without braces in its URL a
> > response similar to the following is received (truncated for brevity)
>
> > <?xml version="1.0" encoding="utf-8" ?>
> > <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> > xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
> > <rdf:Description rdf:about="http://dbpedia.org/resource/Companion_%28manga%29">.....</rdf:Description>
> > </rdf:RDF>
>
> > 3. On the GAE production environment the response to a request for a
> > URL with braces is not an error, but rather an empty rdf document.
>
> > <?xml version="1.0" encoding="utf-8" ?>
> > <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> > xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
> > </rdf:RDF>
>
> > 4.  This led me to speculate that the request received by the
> > remote host was not for the same resource I believed I was
> > requesting.  So, with the help of another non-GAE endpoint I have
> > been logging requests generated via urlfetch and am not able to see
> > any appreciable difference between those sent by the development
> > version, where these requests work, and the production version, where
> > they don't.
>
> > Continuing to investigate....
>
> > On May 5, 5:31 am, George <[email protected]> wrote:
>
> > > Ivan, your problem looks like a common encoding problem. The default
> > > encoding on the GAE servers is ASCII, but on your computer it is
> > > something else, such as UTF-8. So the code works in your development
> > > environment but not on the Google servers.
>
> > > To deal with this problem you need to declare the encoding in the file
> > > header and decode your strings to unicode with the proper charset
> > > before using them. If you don't, the Python interpreter will do the
> > > conversion for you using the system default encoding. I agree this is
> > > a little confusing. Python should do it more elegantly.
>
> > > As for Matthew's problem, sorry, I have no idea either. urlfetch is
> > > a mystery among the GAE libs. I have found several examples that work
> > > fine locally but throw errors on the server. So I can only suggest you
> > > avoid touching dangerous zones like braces in URLs. :-)
>
> > > --
> > > George
>
> > > App Engine Unit Test Framework http://code.google.com/p/gaeunit/
>
> > > On May 4, 5:35 pm, Ivan Maslov <[email protected]> wrote:
>
> > > > > I have a similar problem. On the development server the urlencode
> > > > > function works correctly with unicode strings. In production an
> > > > > error occurs:
> > > > > UnicodeEncodeError: 'ascii' codec can't encode character u'\xe2' in position 2: ordinal not in range(128).
> > > > > It occurs when I pass Russian strings as parameters.
>
> > > > 2009/5/4 Matt Trinneer <[email protected]>
>
> > > > > To further that post...
>
> > > > > It seems to me that URLs containing characters such as ( and ) are not
> > > > > being fetched properly on the production environment.  I've attempted
> > > > > escaping the characters, as per RFC 3986.  However the escaped url
> > > > > > (http://dbpedia.org/resource/Companion_%28manga%29)  doesn't fare any
> > > > > better.