Creating a file on HDFS is a multi-step process. Generalizing and skipping over a lot of details, it comes down to two steps: 1) ask the namenode for a location to write the blocks; 2) connect to the datanode and write your data. The output from your curl command is the response from the namenode, which returns a 307 with a Location header. Your client (curl) is supposed to take that new location and connect to the datanode to write the data. If you add -L to your curl request, you'll see this happening.
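
For what it's worth, the two steps above can be sketched in Python roughly as follows. This is only a sketch under a few assumptions: it uses Python 3's http.client (the renamed httplib) rather than the Python 2 module from the thread, and the names datanode_target and webhdfs_create are made up for illustration. The point it tries to show is that the second PUT goes to whatever URL the namenode returned in Location, not to a hand-built one.

```python
import http.client
from urllib.parse import urlencode, urlsplit


def datanode_target(location):
    """Split a Location header value into (host, port, path-with-query)."""
    parts = urlsplit(location)
    target = parts.path
    if parts.query:
        target += "?" + parts.query
    return parts.hostname, parts.port, target


def webhdfs_create(namenode, hdfs_path, local_path, user):
    """Upload local_path to hdfs_path; returns the datanode's HTTP status.

    Hypothetical helper: `namenode` is "host:port", `hdfs_path` starts with "/".
    """
    # Step 1: ask the namenode where to write. No file data is sent here.
    query = urlencode({"op": "CREATE", "user.name": user})
    nn = http.client.HTTPConnection(namenode)
    nn.request("PUT", "/webhdfs/v1" + hdfs_path + "?" + query)
    res = nn.getresponse()
    res.read()
    location = res.getheader("Location")
    status = res.status
    nn.close()
    if status != 307 or location is None:
        raise RuntimeError("expected a 307 with a Location header, got %d" % status)

    # Step 2: send the data to the URL the namenode returned, unchanged.
    host, port, target = datanode_target(location)
    with open(local_path, "rb") as f:
        data = f.read()
    dn = http.client.HTTPConnection(host, port)
    dn.request("PUT", target, body=data)  # Content-Length is set from the bytes
    res = dn.getresponse()
    res.read()
    status = res.status
    dn.close()
    return status
```

With the values from the thread, the call would look like webhdfs_create("localhost:50070", "/levi/4", "/home/levi/4", "levi"). Passing the bytes through request(body=...) lets http.client set Content-Length itself, which is one less thing to get wrong than the putrequest/endheaders/send sequence.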

Just as an FYI, using httplib with WebHDFS is a solved problem. You have your pick of languages on GitHub that do this already. :)

https://github.com/search?q=webhdfs&type=Repositories&s=updated

--
Adam

On Apr 9, 2013, at 8:32 AM, Daryn Sharp <[email protected]> wrote:

> Try adding -L to your curl and see if that works.
>
> Daryn
>
> On Apr 8, 2013, at 11:05 PM, ??????PHP wrote:
>
>> Really Thanks.
>> But the returned URL is wrong. And the localhost is the real URL, as i
>> tested successfully with curl using "localhost".
>> Can anybody help me translate the curl to Python httplib?
>> curl -i -X PUT -T <LOCAL_FILE>
>> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE"
>> I test it using python httplib, and receive the right response. But the file
>> uploaded to HDFS is empty, no data sent!!
>> Is "conn.send(data)" the problem?
>>
>> ------------------ Original ------------------
>> From: "MARCOS MEDRADO RUBINELLI"<[email protected]>;
>> Date: Mon, Apr 8, 2013 04:22 PM
>> To: "[email protected]"<[email protected]>;
>> Subject: RES: I want to call HDFS REST api to upload a file using httplib.
>>
>> On your first call, Hadoop will return a URL pointing to a datanode in the
>> Location header of the 307 response. On your second call, you have to use
>> that URL instead of constructing your own. You can see the specific
>> documentation here:
>> http://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE
>>
>> Regards,
>> Marcos
>>
>> I want to call HDFS REST api to upload a file using httplib.
>>
>> My program created the file, but no content is in it.
>>
>> =====================================================
>>
>> Here is my code:
>>
>> import httplib
>>
>> conn = httplib.HTTPConnection("localhost:50070")
>> conn.request("PUT", "/webhdfs/v1/levi/4?op=CREATE")
>> res = conn.getresponse()
>> print res.status, res.reason
>> conn.close()
>>
>> conn = httplib.HTTPConnection("localhost:50075")
>> conn.connect()
>> conn.putrequest("PUT", "/webhdfs/v1/levi/4?op=CREATE&user.name=levi")
>> conn.endheaders()
>> a_file = open("/home/levi/4", "rb")
>> a_file.seek(0)
>> data = a_file.read()
>> conn.send(data)
>> res = conn.getresponse()
>> print res.status, res.reason
>> conn.close()
>>
>> ==================================================
>>
>> Here is the return:
>>
>> 307 TEMPORARY_REDIRECT
>> 201 Created
>>
>> =========================================================
>>
>> OK, the file was created, but no content was sent.
>>
>> When I comment the #conn.send(data), the result is the same, still no
>> content.
>>
>> Maybe the file read or the send is wrong, not sure.
>>
>> Do you know how this happened?
