Creating a file on HDFS is a multi-step process. If you allow me to generalize
and skip over a lot of details, it's essentially a two-step process:

1) Ask the namenode for a location to write the blocks.
2) Connect to the datanode and write your data.

The output from your curl statement is the response from the namenode, which
returns a 307 and a Location header. Your client (curl) is supposed to follow
that new location and connect to the datanode to write the data. If you add -L
to your curl request, you'll see this happening.
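
For what it's worth, the two steps above could be sketched in Python roughly as
follows. This is only a minimal sketch, not a tested client: it assumes Python 3,
where httplib was renamed http.client, and the helper names split_location and
webhdfs_create are made up for illustration. Passing body= to request() makes
the library set Content-Length for you, which putrequest()/endheaders() followed
by send() does not do; that may well be why the datanode saw an empty body.

```python
import http.client
from urllib.parse import urlsplit


def split_location(location):
    """Split a Location header URL into (host, port, path_with_query)."""
    parts = urlsplit(location)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, parts.port or 80, path


def webhdfs_create(namenode_host, namenode_port, hdfs_path, data, user=None):
    """Hypothetical helper: upload bytes via WebHDFS, return the final status."""
    query = "op=CREATE"
    if user:
        query += "&user.name=" + user

    # Step 1: ask the namenode where to write. No file data is sent here.
    conn = http.client.HTTPConnection(namenode_host, namenode_port)
    conn.request("PUT", "/webhdfs/v1" + hdfs_path + "?" + query)
    res = conn.getresponse()
    res.read()
    location = res.getheader("Location")
    conn.close()
    if res.status != 307 or not location:
        raise RuntimeError("expected 307 with a Location header, got %d" % res.status)

    # Step 2: PUT the data to the URL the namenode returned, unmodified.
    host, port, path = split_location(location)
    conn = http.client.HTTPConnection(host, port)
    conn.request("PUT", path, body=data)  # body= sets Content-Length
    res = conn.getresponse()
    res.read()
    conn.close()
    return res.status
```

With the values from the original post, a call might look like
webhdfs_create("localhost", 50070, "/levi/4", open("/home/levi/4", "rb").read(),
user="levi"), and a 201 would be the status to expect from the datanode.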

Just as an FYI, using httplib for WebHDFS is a solved problem. You have your
pick of languages on GitHub that do this already.  :)

https://github.com/search?q=webhdfs&type=Repositories&s=updated    

-- Adam

On Apr 9, 2013, at 8:32 AM, Daryn Sharp <[email protected]> wrote:

> Try adding -L to your curl and see if that works.
> 
> Daryn
> 
> On Apr 8, 2013, at 11:05 PM, ??????PHP wrote:
> 
>> Thanks very much.
>> But the returned URL is wrong. localhost is the real URL, as I tested it
>> successfully with curl using "localhost".
>> Can anybody help me translate this curl command to Python httplib?
>> curl -i -X PUT -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE"
>> I tested it using Python httplib and received the right response, but the
>> file uploaded to HDFS is empty; no data was sent!
>> Is "conn.send(data)" the problem?
>> 
>> ------------------ Original ------------------
>> From:  "MARCOS MEDRADO RUBINELLI"<[email protected]>;
>> Date:  Mon, Apr 8, 2013 04:22 PM
>> To:  "[email protected]"<[email protected]>;
>> Subject:  RES: I want to call HDFS REST api to upload a file using httplib.
>> 
>> On your first call, Hadoop will return a URL pointing to a datanode in the 
>> Location header of the 307 response. On your second call, you have to use 
>> that URL instead of constructing your own. You can see the specific 
>> documentation here:
>> http://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE
>> 
>> Regards,
>> Marcos
>> 
>> I want to call HDFS REST api to upload a file using httplib.
>> 
>> My program created the file, but no content is in it.
>> 
>> =====================================================
>> 
>> Here is my code:
>> 
>> import httplib
>> 
>> conn = httplib.HTTPConnection("localhost:50070")
>> conn.request("PUT", "/webhdfs/v1/levi/4?op=CREATE")
>> res = conn.getresponse()
>> print res.status, res.reason
>> conn.close()
>> 
>> conn = httplib.HTTPConnection("localhost:50075")
>> conn.connect()
>> conn.putrequest("PUT", "/webhdfs/v1/levi/4?op=CREATE&user.name=levi")
>> conn.endheaders()
>> a_file = open("/home/levi/4", "rb")
>> a_file.seek(0)
>> data = a_file.read()
>> conn.send(data)
>> res = conn.getresponse()
>> print res.status, res.reason
>> conn.close()
>> ==================================================
>> 
>> Here is the return:
>> 
>> 307 TEMPORARY_REDIRECT
>> 201 Created
>> 
>> =========================================================
>> 
>> OK, the file was created, but no content was sent.
>> 
>> When I comment out conn.send(data), the result is the same: still no
>> content.
>> 
>> Maybe the file read or the send is wrong; I'm not sure.
>> 
>> Do you know why this happens?
>> 
> 
