Hi Vinithra, others,

We are using CDH3b3 (which works amazingly well!!), and it's nice to see Y!'s Kerberos solution coupled to HDFS. But I wouldn't use Hue to upload a set of files adding up to hundreds of GBs or a number of TBs; in my experience, browser-based applications are not suitable for that. Do you have different experiences with Hue? (To be fair, we haven't tested its performance yet.)
We are setting up a cluster that will be shared by people from a number of different institutes, all working on different cases with different data. Their work and data should be protected, from the outside world as well as from each other. At the same time they need to be able to transfer their data onto HDFS, with a high enough throughput, from their local clusters / machines. Is there a standard that others are using and that works for shared clusters? How are Y! people getting their data onto HDFS?

Right now we are using SFTP. Our authentication handling is a bit hacky, but it works: we've coupled our LDAP server to Hue through an Auth*Handler, which also allows us to execute a script that updates the authentication tokens for our FTP. (Rough sketches of this backend, and of a Kerberized alternative for bulk uploads, are appended below the quoted thread.) So far the throughput is far from high enough, though - 1.5 MB/s - with data going over the line unencrypted. Unless we can raise that significantly, while also providing the option to encrypt the data on the wire, it will probably not be a long-term solution.

If anybody can share experiences on transparently and securely getting data onto HDFS from external locations, that would be much appreciated!

Cheers,
Evert

________________________________________
From: Vinithra Varadharajan [vinit...@cloudera.com]
Sent: Friday, November 26, 2010 10:12 PM
To: gene...@hadoop.apache.org; Evert Lammerts
Cc: hdfs-user@hadoop.apache.org; Hue-Users
Subject: Re: interfaces to HDFS icw Kerberos + Hue-user mailing list

Hi Evert,

Which version of Hue and CDH are you using? CDH3b3 includes Yahoo's security patches, which provide Kerberos authentication. In CDH3b3 we have also changed Hue's filebrowser application (the interface for uploading data into HDFS) so that it works with Hadoop's authentication. Is this similar to what you're looking for?

Thanks,
Vinithra

On Thu, Nov 25, 2010 at 11:16 AM, Evert Lammerts <evert.lamme...@sara.nl> wrote:

Hi list,

We're considering providing our users with FTP and WebDAV interfaces (with software provided here: http://www.hadoop.iponweb.net/). These both support user accounts, so we'll be able to deal with authentication. We're also evaluating Cloudera's Hue, which we have coupled to our LDAP service for authentication, as an interface to MapReduce.

These solutions are not the most beautiful in terms of authentication. We'd much prefer to use Kerberos as provided by Y!. But if we do so, how will we enable users to get data from the outside world onto HDFS? How do others provide secure but easy interfaces to HDFS?

Kind regards,
Evert Lammerts
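PS: For those curious about the LDAP/Hue coupling mentioned above: Hue is a Django application, so the glue can be written as an ordinary Django authentication backend. What follows is a minimal sketch of that approach only - the class name, LDAP URI, DN template, and token-sync script are placeholders, not code from any actual deployment.

    # Minimal sketch of an LDAP-backed Django auth backend for Hue.
    # LDAP_URI, the DN template, and the token-sync script are placeholders.
    import ldap                       # python-ldap
    import subprocess
    from django.contrib.auth.models import User

    LDAP_URI = "ldap://ldap.example.org"                        # placeholder
    USER_DN_TEMPLATE = "uid=%s,ou=people,dc=example,dc=org"     # placeholder
    TOKEN_SYNC_SCRIPT = "/usr/local/bin/update-ftp-tokens.sh"   # placeholder

    class LdapBackend(object):
        """Authenticate Hue logins with a simple LDAP bind, then run an
        external script to keep the (S)FTP auth tokens in sync."""

        def authenticate(self, username=None, password=None):
            conn = ldap.initialize(LDAP_URI)
            try:
                conn.simple_bind_s(USER_DN_TEMPLATE % username, password)
            except ldap.LDAPError:
                return None           # bad credentials or LDAP unreachable
            conn.unbind_s()
            # Mirror the LDAP account into Django's user table.
            user, _created = User.objects.get_or_create(username=username)
            # Side effect described above: refresh the FTP auth tokens.
            subprocess.call([TOKEN_SYNC_SCRIPT, username])
            return user

        def get_user(self, user_id):
            try:
                return User.objects.get(pk=user_id)
            except User.DoesNotExist:
                return None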
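PPS: On the Kerberos side: once a cluster runs the CDH3b3 security patches, any client machine holding a valid ticket can push data with the stock Hadoop CLI, and running several puts in parallel may help where a single stream tops out (as our 1.5 MB/s SFTP link does). Again a sketch only - keytab, principal, paths, and worker count are made-up values - and note that Kerberos authenticates the transfer but does not by itself encrypt the data stream.

    # Sketch: kinit once, then upload a directory with parallel
    # "hadoop fs -put" streams. All constants below are made-up values.
    import os
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    KEYTAB = "/etc/security/keytabs/evert.keytab"   # placeholder
    PRINCIPAL = "evert@EXAMPLE.ORG"                 # placeholder
    LOCAL_DIR = "/data/incoming"                    # placeholder
    HDFS_DEST = "/user/evert/incoming"              # placeholder
    WORKERS = 4                                     # tune against your link

    def put(path):
        """Upload one local file into HDFS; raises if the put fails."""
        subprocess.check_call(["hadoop", "fs", "-put", path, HDFS_DEST])
        return path

    # One ticket for the session; the Hadoop CLI reads the credential cache.
    subprocess.check_call(["kinit", "-kt", KEYTAB, PRINCIPAL])

    files = [os.path.join(LOCAL_DIR, name) for name in os.listdir(LOCAL_DIR)]
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        for uploaded in pool.map(put, files):
            print("uploaded", uploaded)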