Has anyone tried using SWIG to wrap libhdfs? I spent some time today doing
this, and it seems like it could be a great solution, but it's also a fair
amount of work (especially having never used SWIG before). If this seems
generally worthwhile I could finish it up.
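As a rough sketch of the sort of thing I'm after (the module name and
connection details are placeholders; the functions mirror the C API
declared in hdfs.h, which SWIG would expose more or less directly):

    # Rough sketch: calling a SWIG-generated wrapper around libhdfs.
    # 'libhdfs' is a placeholder module name; hdfsConnect, hdfsExists,
    # and hdfsDisconnect come from the libhdfs C API in hdfs.h.
    import libhdfs

    fs = libhdfs.hdfsConnect("namenode-host", 8020)  # placeholder host/port
    if libhdfs.hdfsExists(fs, "/user/travis") == 0:  # 0 means the path exists
        print "path exists"
    libhdfs.hdfsDisconnect(fs)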
Or is the Thrift interface the API to use? Is anyone successfully using
it? I'm primarily interested in building some filesystem management +
reporting tools, so being slower than the Java interface is not a problem.
I'd prefer not to parse the command-line tool output, though :)

--travis

On Tue, Aug 10, 2010 at 9:39 AM, Philip Zeyliger <phi...@cloudera.com> wrote:
>
> On Tue, Aug 10, 2010 at 5:06 AM, Bjoern Schiessle <bjo...@schiessle.org>
> wrote:
>>
>> Hi Philip,
>>
>> On Mon, 9 Aug 2010 16:35:07 -0700 Philip Zeyliger wrote:
>> > To give you an example of how this may be done, HUE, under the covers,
>> > pipes your data to 'bin/hadoop fs -Dhadoop.job.ugi=user,group put -
>> > path'. (That's from memory, but it's approximately right; the full
>> > python code is at
>> > http://github.com/cloudera/hue/blob/master/desktop/libs/hadoop/src/hadoop/fs/hadoopfs.py#L692
>> > )
>>
>> Thank you! If I understand it correctly this only works if my python app
>> runs on the same server as hadoop, right?
>
> It works only if your python app has network connectivity to your
> namenode. You can access an explicitly specified HDFS by passing
> -Dfs.default.name=hdfs://<namenode>:<namenode_port>/ . (The default is
> read from hadoop-site.xml (or perhaps hdfs-site.xml), and, I think,
> defaults to file:///).
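For what it's worth, the pipe-to-the-CLI approach Philip describes above
might look something like this from Python (a rough sketch; the namenode
host, port, and paths are placeholders):

    # Sketch of shelling out to the hadoop CLI and piping data to it on
    # stdin, as HUE does. Host, port, and paths below are placeholders.
    import subprocess

    def hdfs_put(data, dest_path,
                 namenode="namenode.example.com", port=8020):
        cmd = ["bin/hadoop", "fs",
               "-Dfs.default.name=hdfs://%s:%d/" % (namenode, port),
               "-put", "-", dest_path]  # "-" makes -put read from stdin
        p = subprocess.Popen(cmd, stdin=subprocess.PIPE)
        p.communicate(data)
        if p.returncode != 0:
            raise IOError("hadoop fs -put exited with %d" % p.returncode)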