running hadoop with gij

2008-07-17 Thread Gert Pfeifer

Has anyone tried to get Hadoop running on the GNU Java environment? Does
that work?

Cheers,
Gert


[email recall] ROME 1.0 RC1 disted

2008-07-17 Thread Alejandro Abdelnur
Apologies for my previous email; somehow my email aliases got
screwed up, and it should have gone to the ROME alias.

Alejandro

On Wed, Jul 16, 2008 at 10:43 PM, Alejandro Abdelnur [EMAIL PROTECTED] wrote:
 I've updated the ROME wiki and uploaded the javadocs and dists; I'm
 tagging CVS at the moment.

 Would somebody double-check that everything is OK, links and downloads?

 Also, we should get the JAR into a Maven repo. Who has the necessary
 permissions to do this?

 If no issues arise, we can do a final 1.0 release in a couple of weeks,
 retagging RC1 as 1.0. That would also give time for any subproject
 (Fetcher?) that wants to go 1.0 in tandem.

 Cheers.

 A



Re: running hadoop with gij

2008-07-17 Thread Matt Kent
There be dragons. Use the Sun JVM.

On Thu, 2008-07-17 at 13:45 +0200, Gert Pfeifer wrote:
 Has anyone tried to get Hadoop running on the GNU Java environment? Does
 that work?
 
 Cheers,
 Gert



Re: Namenode Exceptions with S3

2008-07-17 Thread Doug Cutting

Tom White wrote:

You can use S3 as the default FS; it's just that you then can't run
HDFS at all. You would only do this if you don't want to use HDFS at
all, for example, if you were running a MapReduce job that read from
S3 and wrote to S3.


Can't one work around this by using a different configuration on the
client than on the namenodes and datanodes?  The client should be able
to set fs.default.name to an s3: URI, while the namenode and datanode
must have it set to an hdfs: URI, no?
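
Something like this on the client side, for instance.  This is only a
rough sketch: the bucket name and path are made up, and it assumes the
daemons' hadoop-site.xml keeps fs.default.name pointed at the hdfs: URI.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class S3ClientCheck {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Client-only override; the namenode/datanode configs are untouched.
      // (AWS credentials would go in fs.s3.awsAccessKeyId and
      // fs.s3.awsSecretAccessKey.)
      conf.set("fs.default.name", "s3://example-bucket");
      FileSystem fs = FileSystem.get(conf);
      System.out.println(fs.exists(new Path("/input")));
    }
  }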


Would it be useful to add command-line options to namenode and datanode 
that override the configuration, so that one could start non-default 
HDFS daemons?



It might be less confusing if the HDFS daemons didn't use
fs.default.name to define the namenode host and port. Just like
mapred.job.tracker defines the host and port for the jobtracker,
dfs.namenode.address (or similar) could define the namenode. Would
this be a good change to make?


Probably.  For back-compatibility we could leave it empty by default,
deferring to fs.default.name; only if folks specify a non-empty
dfs.namenode.address would it be used.
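
The fallback itself would be trivial, something along these lines (a
sketch only; dfs.namenode.address is the proposed key, not an existing
one):

  import org.apache.hadoop.conf.Configuration;

  public class NameNodeAddress {
    // Prefer dfs.namenode.address when non-empty; otherwise defer to
    // fs.default.name for back-compatibility.
    static String get(Configuration conf) {
      String addr = conf.get("dfs.namenode.address", "");
      return addr.length() == 0 ? conf.get("fs.default.name") : addr;
    }
  }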


Doug


Re: running hadoop with gij

2008-07-17 Thread Andreas Kostyrka
On Thursday 17 July 2008 13:45:15 Gert Pfeifer wrote:
 Has anyone tried to get Hadoop running on the GNU Java environment? Does
 that work?

Considering how stable it runs on the plain standard Sun JVM, I'd reserve
the gij task for the next monthly meeting of Masochists Anonymous.

Andreas


 Cheers,
 Gert






Re: Namenode Exceptions with S3

2008-07-17 Thread Tom White
On Thu, Jul 17, 2008 at 6:16 PM, Doug Cutting [EMAIL PROTECTED] wrote:
 Can't one work around this by using a different configuration on the client
 than on the namenodes and datanodes?  The client should be able to set
 fs.default.name to an s3: URI, while the namenode and datanode must have it
 set to an hdfs: URI, no?

Yes, that's a good solution.

 It might be less confusing if the HDFS daemons didn't use
 fs.default.name to define the namenode host and port. Just like
 mapred.job.tracker defines the host and port for the jobtracker,
 dfs.namenode.address (or similar) could define the namenode. Would
 this be a good change to make?

 Probably.  For back-compatibility we could leave it empty by default,
 deferring to fs.default.name; only if folks specify a non-empty
 dfs.namenode.address would it be used.

I've opened https://issues.apache.org/jira/browse/HADOOP-3782 for this.

Tom


can hadoop read files backwards

2008-07-17 Thread Elia Mazzawi
Is there a way to have Hadoop hand over the lines of a file to my
mapper backwards?

As in: give the last line first.


Re: can hadoop read files backwards

2008-07-17 Thread Jim R. Wilson
It sounds to me like you're talking about hadoop streaming (correct me
if I'm wrong there).  In that case, there's really no order to the
lines being doled out as I understand it.  Any given line could be
handed to any given mapper task running on any given node.

I may be wrong, of course; someone closer to the project could give
you the right answer in that case.
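
If the order really matters and you can write a Java job (rather than
streaming), one workaround might be a pre-pass that re-sorts the lines:
TextInputFormat keys each line by its byte offset, so a single-reducer
job that sorts those offsets in decreasing order should emit the last
line first.  A rough, untested sketch (old mapred API; it assumes a
single input file, and the paths are made up):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.*;
  import org.apache.hadoop.mapred.lib.IdentityMapper;
  import org.apache.hadoop.mapred.lib.IdentityReducer;

  public class ReverseLines {
    public static void main(String[] args) throws Exception {
      JobConf job = new JobConf(ReverseLines.class);
      job.setJobName("reverse-lines");
      FileInputFormat.setInputPaths(job, new Path("in"));
      FileOutputFormat.setOutputPath(job, new Path("out"));
      job.setMapperClass(IdentityMapper.class);   // pass (offset, line) through
      job.setReducerClass(IdentityReducer.class);
      job.setNumReduceTasks(1);                   // one reducer = total order
      job.setOutputKeyClass(LongWritable.class);  // byte offset of each line
      job.setOutputValueClass(Text.class);
      // Sort offsets high-to-low so later lines come out first.
      job.setOutputKeyComparatorClass(LongWritable.DecreasingComparator.class);
      JobClient.runJob(job);
    }
  }

Note the output comes out as "offset <TAB> line", so you'd strip the
offsets in a follow-up step.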

-- Jim R. Wilson (jimbojw)

On Thu, Jul 17, 2008 at 4:06 PM, Elia Mazzawi
[EMAIL PROTECTED] wrote:
 Is there a way to have Hadoop hand over the lines of a file to my
 mapper backwards?

 As in: give the last line first.



Data locality with CompositeInputFormat

2008-07-17 Thread Christian Kunz
When specifying multiple input directories for the CompositeInputFormat,
is there any deterministic selection of where the tasks are placed (data
locality)?
Any preference for running rack-local or node-local to the splits of the
first/last input directory?
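
In case it helps, the job composes its inputs roughly like this (a
sketch; the join type and paths here are invented):

  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.KeyValueTextInputFormat;
  import org.apache.hadoop.mapred.join.CompositeInputFormat;

  public class JoinSetup {
    static void configure(JobConf job) {
      job.setInputFormat(CompositeInputFormat.class);
      // Two input directories joined on key; the question is which one,
      // if either, drives where the map tasks land.
      job.set("mapred.join.expr", CompositeInputFormat.compose(
          "inner", KeyValueTextInputFormat.class,
          "/data/first", "/data/second"));
    }
  }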

Thanks,
-Christian



Restricting Job Submission Access

2008-07-17 Thread Theocharis Ian Athanasakis
What's the recommended way to restrict job submission and HDFS access,
besides a firewall?

Thanks,

Ian.


Re: Restricting Job Submission Access

2008-07-17 Thread Allen Wittenauer



On 7/17/08 3:33 PM, Theocharis Ian Athanasakis [EMAIL PROTECTED] wrote:

 What's the recommended way to restrict job submission and HDFS access,
 besides a firewall?

We basically put bastion hosts (we call them gateways) next to Hadoop;
users use them to submit jobs, access the HDFS, etc.  By limiting who
can get onto the gateways, we limit access.  We also use HOD, so we have
all of Torque's access and resource control capabilities as well.

Not a replacement for real security, obviously.

Oh, I think there might be some diagrams, pictures, and other info about
this in my preso on the Hadoop wiki.