Alexander Shorin created HDFS-9896:
--------------------------------------

             Summary: WebHDFS API may return invalid JSON
                 Key: HDFS-9896
                 URL: https://issues.apache.org/jira/browse/HDFS-9896
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: webhdfs
         Environment: FreeBSD 10.2
Hadoop 2.6.0

            Reporter: Alexander Shorin


{code}
>>> import requests
>>> resp = requests.get('http://server:50000/webhdfs/v1/tmp/test/\x00/not_found.txt?op=GETFILESTATUS')
>>> resp.content
'{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File does not exist: /tmp/test/\x00/not_found.txt"}}'
>>> resp.json()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sandbox/project/venv/lib/python2.7/site-packages/requests/models.py", line 800, in json
    self.content.decode(encoding), **kwargs
  File "/usr/local/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python2.7/json/decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid control character at: line 1 column 147 (char 146)
{code}

The null byte {{\x00}} should be escaped as {{\u0000}} according to JSON rules.
It seems that WebHDFS echoes the path back as is, without any escaping, so the
response body is not valid JSON despite being served as such.
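
To illustrate the expected behaviour, here is a minimal Python sketch. It does not reproduce the WebHDFS server code: the {{naive}} string only imitates a response built by plain string concatenation, which is an assumption about what the server does, and {{is_valid_json}} is a hypothetical helper. It shows that a raw control character inside a JSON string is rejected by a strict parser, while a standard serializer escapes it as {{\u0000}}.

```python
import json

path = "/tmp/test/\x00/not_found.txt"
message = "File does not exist: " + path

# Assumed server behaviour: the path is pasted into the body unescaped,
# so the raw NUL byte ends up inside the JSON string literal.
naive = '{"RemoteException":{"message":"' + message + '"}}'

# A JSON serializer escapes control characters (U+0000..U+001F) instead.
escaped = json.dumps({"RemoteException": {"message": message}})


def is_valid_json(text):
    """Hypothetical helper: True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


print(is_valid_json(naive))
print(is_valid_json(escaped))
print(escaped)
```

As a client-side stopgap only, Python's {{json.loads(text, strict=False)}} tolerates raw control characters inside strings, so {{json.loads(naive, strict=False)}} parses the body above. That does not make the response valid JSON; the escaping still belongs on the server side.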



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
