Hi Vinithra, others,

We are using CDH3b3 (which works amazingly well!!). And it's nice to see Y!'s 
Kerberos solution coupled to HDFS. But I wouldn't use Hue to upload a set of 
files totalling hundreds of GBs or several TBs. Browser-based applications are 
not suitable for that, in my experience. Do you have different experiences 
with Hue? (To be fair, we haven't tested its performance yet.)

We are setting up a cluster that will be shared by people from a number of 
different institutes, all working on different cases with different data. Their 
work and data need to be protected, including from each other. At the same time they 
need to be able to transfer their data onto HDFS (with a high enough 
throughput) from their local clusters / machines. Is there a standard that 
others are using and that works for shared clusters? How are Y! people getting 
their data onto HDFS?

Right now we are using SFTP. We handle authentication in a somewhat 'hacky' way, 
but it works: we've coupled our LDAP server to Hue through an Auth*Handler, 
which also allows us to execute a script that updates authentication tokens for 
our SFTP server. So far the throughput is nowhere near high enough though - 
about 1.5 MB/s - and that is with data going over the line unencrypted. Unless 
we can raise that significantly, while also providing the option to encrypt the 
data on the wire, this will probably not be a long-term solution.
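One alternative we may try is to skip the intermediate staging copy and stream data over SSH straight into an HDFS put on a gateway node. A rough sketch (the gateway host "hdfs-gw" and the paths are placeholders, not our real setup; it assumes a configured Hadoop client on the gateway, plus a valid Kerberos ticket there once security is enabled):

```shell
# Stream a local file over SSH directly into HDFS via a gateway node.
# "hdfs-gw" and the target path are placeholders. The trailing "-"
# makes 'hadoop fs -put' read from stdin, so no staging copy is
# written to the gateway's local disk. SSH also gives us on-the-wire
# encryption for free.
cat bigfile.dat | ssh user@hdfs-gw "hadoop fs -put - /user/evert/bigfile.dat"
```

Whether this gets the throughput up remains to be seen; SSH's encryption overhead may turn out to be the bottleneck again.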

If anybody can share experiences on transparently and securely getting data 
onto HDFS from external locations, that would be much appreciated!

Cheers,
Evert

________________________________________
From: Vinithra Varadharajan [vinit...@cloudera.com]
Sent: Friday, November 26, 2010 10:12 PM
To: gene...@hadoop.apache.org; Evert Lammerts
Cc: hdfs-user@hadoop.apache.org; Hue-Users
Subject: Re: interfaces to HDFS icw Kerberos

+ Hue-user mailing list

Hi Evert,

Which version of Hue and CDH are you using? CDH3b3 includes Yahoo's security 
patches, which provide Kerberos authentication. In CDH3b3 we have also updated 
Hue's filebrowser application, which provides an interface for uploading data 
into HDFS, so that it works with Hadoop's authentication. Is this similar to 
what you're looking for?

Thanks,
Vinithra

On Thu, Nov 25, 2010 at 11:16 AM, Evert Lammerts 
<evert.lamme...@sara.nl> wrote:
Hi list,

We're considering providing our users with FTP and WebDAV interfaces (using
software provided here: http://www.hadoop.iponweb.net/). These both support
user accounts, so we'll be able to deal with authentication. We're also
evaluating Cloudera's Hue, which we have coupled to our LDAP service for
authentication, as an interface to MapReduce.
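From the user's point of view, an upload would then look like a plain FTP (or WebDAV) transfer into what is really HDFS. A hypothetical example (the host name and path are placeholders, not a real endpoint):

```shell
# Upload a file through an FTP-to-HDFS gateway; curl prompts for the
# user's LDAP password. Host and target path are placeholders.
curl -T data.csv --user evert ftp://hdfs-ftp.example.org/user/evert/data.csv
```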

These solutions are not the most elegant in terms of authentication. We'd
much prefer to use Kerberos as provided by Y!. But if we do so, how will we
enable users to get data from the outside world onto HDFS? How do others
provide secure but easy interfaces to HDFS?

Kind regards,
Evert Lammerts
