Hi all. This is an announcement of a new Python wrapper around libhdfs called cyhdfs. The "cy" stands for Cython.
PyPI: http://pypi.python.org/pypi/cyhdfs/0.1.2
Bitbucket: https://bitbucket.org/turnaev/cyhdfs/src

The license is BSD. The development status is beta, but we have been using it in production for three months in a daemon that receives zmq messages and writes them to HDFS using this library. So the rest of the API, apart from writing files, is less tested in production.

I wrote it because I could not find anything simple for reading from and writing to HDFS from Python. Pydoop has an HDFS API, but it uses Boost, and we did not manage to compile it on FreeBSD. I also wanted something lightweight and simple.

I have compiled it on FreeBSD and Ubuntu with Cloudera hadoop-0.20.2-cdh3u3, hadoop-0.20.2-cdh3u5 and hadoop-1.0.3. To compile it you need Cython installed. However, if your Hadoop version is higher than 0.20.2-cdh3u5, there is a chance it will compile without Cython, because cyhdfs ships with a pregenerated .cpp file for those versions. Compilation with Hadoop 2.X.X is untested, as we are not planning to use it.

If anyone wants to add support for Hadoop 2.X.X, or just maintain this library, I would gladly give that person full rights to the repository and the PyPI project.

I'll be glad if anyone finds this library useful, or if it gets included in the Hadoop distribution.

--
--------------------------------------------
Турнаев Евгений Викторович
--------------------------------------------
