Has anyone tried using SWIG to wrap libhdfs? I spent some time today doing
this, and it seems like it could be a great solution, but it's also a fair
amount of work (especially having never used SWIG before). If this seems
generally worthwhile I could finish it up.
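As a rough sketch of the sort of thing I'm after (the module name and
connection details are placeholders; the functions mirror the C API
declared in hdfs.h, which SWIG would expose more or less directly):

    # Rough sketch: calling a SWIG-generated wrapper around libhdfs.
    # 'libhdfs' is a placeholder module name; hdfsConnect, hdfsExists,
    # and hdfsDisconnect come from the libhdfs C API in hdfs.h.
    import libhdfs

    fs = libhdfs.hdfsConnect("namenode-host", 8020)  # placeholder host/port
    if libhdfs.hdfsExists(fs, "/user/travis") == 0:  # 0 means the path exists
        print "path exists"
    libhdfs.hdfsDisconnect(fs)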
Or is the Thrift interface the API to use? Is anyone successfully using
it? I'm primarily interested in building some filesystem management +
reporting tools, so being slower than the Java interface is not a problem.
I'd prefer not to parse the command-line tool output, though :)

--travis

On Tue, Aug 10, 2010 at 9:39 AM, Philip Zeyliger <phi...@cloudera.com> wrote:
>
> On Tue, Aug 10, 2010 at 5:06 AM, Bjoern Schiessle <bjo...@schiessle.org>
> wrote:
>>
>> Hi Philip,
>>
>> On Mon, 9 Aug 2010 16:35:07 -0700 Philip Zeyliger wrote:
>> > To give you an example of how this may be done, HUE, under the covers,
>> > pipes your data to 'bin/hadoop fs -Dhadoop.job.ugi=user,group put -
>> > path'. (That's from memory, but it's approximately right; the full
>> > python code is at
>> > http://github.com/cloudera/hue/blob/master/desktop/libs/hadoop/src/hadoop/fs/hadoopfs.py#L692
>> > )
>>
>> Thank you! If I understand it correctly this only works if my python app
>> runs on the same server as hadoop, right?
>
> It works only if your python app has network connectivity to your
> namenode. You can access an explicitly specified HDFS by passing
> -Dfs.default.name=hdfs://<namenode>:<namenode_port>/ . (The default is
> read from hadoop-site.xml (or perhaps hdfs-site.xml), and, I think,
> defaults to file:///).
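For what it's worth, the pipe-to-the-CLI approach Philip describes above
might look something like this from Python (a rough sketch; the namenode
host, port, and paths are placeholders):

    # Sketch of shelling out to the hadoop CLI and piping data to it on
    # stdin, as HUE does. Host, port, and paths below are placeholders.
    import subprocess

    def hdfs_put(data, dest_path,
                 namenode="namenode.example.com", port=8020):
        cmd = ["bin/hadoop", "fs",
               "-Dfs.default.name=hdfs://%s:%d/" % (namenode, port),
               "-put", "-", dest_path]  # "-" makes -put read from stdin
        p = subprocess.Popen(cmd, stdin=subprocess.PIPE)
        p.communicate(data)
        if p.returncode != 0:
            raise IOError("hadoop fs -put exited with %d" % p.returncode)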