Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "IdeasOnLdapConfiguration" page has been changed by SomeOtherAccount.
http://wiki.apache.org/hadoop/IdeasOnLdapConfiguration

--------------------------------------------------

New page:
This is for HADOOP-5670.

First, a bit about LDAP (extremely simplified).

All objects in LDAP are defined by one or more object classes.  These object 
classes define the attributes that a given object can utilize.  The attributes 
in turn have definitions that determine what kinds of values can be used.  This 
includes things such as string, integer, etc., but also whether or not the 
attribute can hold more than one value.  Object classes, attributes, and values 
are all defined in a schema.

One key feature of LDAP is the ability to search for objects using a simple, 
prefix-notation filter syntax.  Let's say we have an object class that has this 
definition:

objectclass: node
hostname: string
domain: string

and in our LDAP server, we have placed the following objects:

hostname=myhost1
objectclass=node
domain=example.com

hostname=myhost2
objectclass=node
domain=example.com

We can now do an LDAP search with (&(objectclass=node)(hostname=myhost1)) to 
find the 'myhost1' object.  Similarly, we can search with 
(&(objectclass=node)(domain=example.com)) to find both the myhost1 and myhost2 
objects.
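
To make the filters concrete, here is a rough Java/JNDI sketch of the two 
searches above.  The class name, server URL, and search base 
(dc=example,dc=com) are assumptions for illustration only:

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class NodeSearchExample {
  public static void main(String[] args) throws NamingException {
    Hashtable<String, String> env = new Hashtable<String, String>();
    env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389");  // assumed server
    DirContext ctx = new InitialDirContext(env);

    SearchControls controls = new SearchControls();
    controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
    String base = "dc=example,dc=com";  // assumed search base

    // (&(objectclass=node)(hostname=myhost1)) -> just myhost1
    NamingEnumeration<SearchResult> results =
        ctx.search(base, "(&(objectclass=node)(hostname=myhost1))", controls);
    while (results.hasMore()) {
      System.out.println("single: " + results.next().getNameInNamespace());
    }

    // (&(objectclass=node)(domain=example.com)) -> myhost1 and myhost2
    results = ctx.search(base, "(&(objectclass=node)(domain=example.com))", controls);
    while (results.hasMore()) {
      System.out.println("all:    " + results.next().getNameInNamespace());
    }
    ctx.close();
  }
}

The filter is evaluated on the server side, so the client only ever sees the 
entries that match.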

Let's apply these ideas to Hadoop.  Here are some rough objectclasses that we 
can use for demonstration purposes:

For generic properties: hadoopGlobalConfig
hadoop.tmp.dir: string
fs.default.name: string
dfs.block.size: integer
dfs.replication: integer
clusterName: string

For datanodes: hadoopDataNode
hostname:  multi-string
dfs.data.dir: multi-string
dfs.datanode.du.reserved: integer
commonname: string

For tasktrackers: hadoopTaskTracker
commonname: string
hostname: multi-string
mapred.job.tracker: string
mapred.local.dir: multi-string
mapred.tasktracker.map.tasks.maximum: integer
mapred.tasktracker.reduce.tasks.maximum: integer

For the jobtracker: hadoopJobTracker
hostname: string
mapred.reduce.tasks: integer
mapred.reduce.slowstart.completed.maps: numeric
mapred.queue.names: multi-string
mapred.jobtracker.taskScheduler: string
mapred.system.dir: string

For the namenode: hadoopNameNode
commonname: string
dfs.http.address: string
hostname: string
dfs.name.dir: multi-string
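
Before defining a grid, here is a rough Java/JNDI sketch of how one of these 
entries could be written, mainly to show that a "multi-string" attribute is 
just an LDAP attribute carrying several values.  The class name, DN layout, 
and values are illustrative assumptions; they simply mirror the sample grid 
defined next.

import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;
import javax.naming.directory.DirContext;

public class ComputeNodeEntry {

  // Build the attribute set for a combined datanode/tasktracker entry.
  static Attributes buildEntry(String... hosts) {
    // An entry can carry more than one object class...
    Attribute objectClass = new BasicAttribute("objectclass");
    objectClass.add("hadoopDataNode");
    objectClass.add("hadoopTaskTracker");

    // ...and a multi-string attribute is simply an attribute with several values.
    Attribute hostname = new BasicAttribute("hostname");
    for (String h : hosts) {
      hostname.add(h);
    }

    Attributes entry = new BasicAttributes(true);  // ignore attribute-name case
    entry.put(objectClass);
    entry.put(hostname);
    entry.put(new BasicAttribute("dfs.datanode.du.reserved", "10"));
    entry.put(new BasicAttribute("mapred.tasktracker.map.tasks.maximum", "4"));
    return entry;
  }

  // Write the entry; the DN layout mirrors the sample grid defined next.
  static void add(DirContext ctx) throws NamingException {
    ctx.createSubcontext("commonname=simplecomputenode,cluster=red",
                         buildEntry("node1", "node2", "node3"));
  }
}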

Let's define a simple grid:

clusterName=red
objectclass=hadoopGlobalConfig
hadoop.tmp.dir: /tmp
fs.default.name: hdfs://namenode:9000/
dfs.block.size: 128
dfs.replication: 3

commonname=master,cluster=red
objectclass=hadoopNameNode,hadoopJobTracker
dfs.http.address: http://masternode:50070/
hostname: masternode
dfs.name.dir: /nn1,/nn2
mapred.reduce.tasks: 1
mapred.reduce.slowstart.completed.maps: .55
mapred.queue.names: big,small
mapred.jobtracker.taskScheduler: capacity
mapred.system.dir: /system/mapred

commonname=simplecomputenode,cluster=red
objectclass=hadoopDataNode,hadoopTaskTracker
hostname:  node1,node2,node3
dfs.data.dir: /hdfs1, /hdfs2, /hdfs3
dfs.datanode.du.reserved: 10
mapred.job.tracker: commonname=master,cluster=red
mapred.local.dir: /mr1,/mr2,/mr3
mapred.tasktracker.map.tasks.maximum: 4
mapred.tasktracker.reduce.tasks.maximum: 4
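
Putting the pieces together, here is a rough sketch of how a daemon running on 
node2 could assemble its effective configuration from this grid: load the 
hadoopGlobalConfig entry for its cluster first, then overlay the role-specific 
entry that lists its hostname.  The server URL, search base, class and helper 
names below are assumptions for illustration; 
org.apache.hadoop.conf.Configuration is the only actual Hadoop class used.

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;
import org.apache.hadoop.conf.Configuration;

public class LdapConfigLoader {

  // Copy every attribute of one entry into the Configuration, joining
  // multi-valued attributes (dfs.data.dir, mapred.local.dir, ...) with commas.
  static void copyEntry(SearchResult entry, Configuration conf)
      throws NamingException {
    NamingEnumeration<? extends Attribute> attrs = entry.getAttributes().getAll();
    while (attrs.hasMore()) {
      Attribute attr = attrs.next();
      if ("objectclass".equalsIgnoreCase(attr.getID())) {
        continue;  // bookkeeping, not a Hadoop property
      }
      StringBuilder value = new StringBuilder();
      for (int i = 0; i < attr.size(); i++) {
        if (i > 0) value.append(",");
        value.append(attr.get(i).toString());
      }
      conf.set(attr.getID(), value.toString());
    }
  }

  public static void main(String[] args) throws NamingException {
    Hashtable<String, String> env = new Hashtable<String, String>();
    env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389");  // assumed
    DirContext ctx = new InitialDirContext(env);

    SearchControls controls = new SearchControls();
    controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
    String base = "dc=example,dc=com";  // assumed directory root
    String host = "node2";              // this machine's hostname

    Configuration conf = new Configuration();

    // 1. Cluster-wide defaults from the hadoopGlobalConfig entry.
    NamingEnumeration<SearchResult> results = ctx.search(base,
        "(&(objectclass=hadoopGlobalConfig)(clusterName=red))", controls);
    while (results.hasMore()) {
      copyEntry(results.next(), conf);
    }

    // 2. The role-specific entry listing this hostname overrides the defaults.
    results = ctx.search(base,
        "(&(|(objectclass=hadoopDataNode)(objectclass=hadoopTaskTracker))"
        + "(hostname=" + host + "))", controls);
    while (results.hasMore()) {
      copyEntry(results.next(), conf);
    }
    ctx.close();

    System.out.println("mapred.local.dir = " + conf.get("mapred.local.dir"));
  }
}

Layering the two searches this way keeps cluster-wide defaults in a single 
entry, with the per-role entries overriding only what differs.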
