HttpAuthentication.html is out of date (HADOOP-10550) - some questions and proposed fix

2014-05-22 Thread Pankti Majmudar
Hi,
I searched for Hadoop Http Authentication on Google and came across a web
page which lists the required steps:
http://hadoop.apache.org/docs/r1.0.4/HttpAuthentication.html

Should the fix for this bug be to update the document with the steps listed
on the above webpage?

Thanks.


Re: HADOOP_ROOT_LOGGER

2014-05-22 Thread Robert Rati
Ah, that makes sense.  Would it make sense to default the root logger to 
the one defined in the log4j.properties file instead of the static value in 
the script then?  That way an admin can set all the logging properties 
desired in the log4j.properties file, but can still override them with 
HADOOP_ROOT_LOGGER when debugging.


It feels a little black box-y that if HADOOP_ROOT_LOGGER isn't set then 
the root logger set in log4j.properties is ignored.


Maybe this is all very well known and just a bit black box-y to me since 
I'm new-ish to Hadoop.


Rob

On 05/22/2014 03:41 PM, Colin McCabe wrote:

It's not always practical to edit the log4j.properties file.  For one
thing, if you're using a management system, there may be many log4j
properties sprinkled around the system, and it could be difficult to figure
out which is the one you need to edit.  For another, you may not (should
not?) have permission to do this on a production cluster.

Doing something like "HADOOP_ROOT_LOGGER="DEBUG,console" hadoop fs -cat
/foo" has helped me diagnose problems in the past.

best,
Colin


On Thu, May 22, 2014 at 6:34 AM, Robert Rati  wrote:


In my experience the default HADOOP_ROOT_LOGGER definition will override
any root logger defined in log4j.properties, which is where the problems
have arisen.  If the HADOOP_ROOT_LOGGER definition in hadoop-config.sh were
removed, wouldn't the root logger defined in the log4j.properties file be
used?  Or do the client commands not read that configuration file?

I'm trying to understand why the root logger should be defined outside of
the log4j.properties file.

Rob


On 05/22/2014 12:53 AM, Vinayakumar B wrote:


Hi Robert,

I understand your confusion.

HADOOP_ROOT_LOGGER is set to the default value "INFO,console" if it hasn't
been set to anything, and logs will be displayed on the console itself.
This is true for any client command you run, for example: "hdfs dfs -ls /"

But for the server scripts (hadoop-daemon.sh, yarn-daemon.sh, etc.),
HADOOP_ROOT_LOGGER will be set to "INFO,RFA" if the HADOOP_ROOT_LOGGER env
variable is not defined, so that all the log messages of the server daemons
go to log files maintained by the RollingFileAppender. If you want to
override these defaults and set your own log level, then define it as the
env variable HADOOP_ROOT_LOGGER.

For example:
 export HADOOP_ROOT_LOGGER="DEBUG,RFA"
Export the above env variable and then start the server scripts or execute
client commands; all logs go to files and will be maintained by the
RollingFileAppender.
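
For reference, the reason the -Dhadoop.root.logger value from hadoop-config.sh
masks whatever log4j.properties says comes from log4j 1.x property substitution:
${...} placeholders are resolved from JVM system properties first and only then
from the properties file. Below is a minimal standalone sketch of that behavior,
assuming the stock ${hadoop.root.logger} pattern; the class name and values are
illustrative only, not code from Hadoop.

import java.util.Properties;
import org.apache.log4j.Logger;
import org.apache.log4j.PropertyConfigurator;

public class RootLoggerDemo {
  public static void main(String[] args) {
    // What hadoop-config.sh effectively does: -Dhadoop.root.logger=DEBUG,console
    System.setProperty("hadoop.root.logger", "DEBUG,console");

    // A stripped-down stand-in for the shipped log4j.properties
    Properties props = new Properties();
    props.setProperty("hadoop.root.logger", "INFO,console");          // default in the file
    props.setProperty("log4j.rootLogger", "${hadoop.root.logger}");   // placeholder
    props.setProperty("log4j.appender.console", "org.apache.log4j.ConsoleAppender");
    props.setProperty("log4j.appender.console.layout", "org.apache.log4j.SimpleLayout");
    PropertyConfigurator.configure(props);

    // The placeholder is resolved from the system property before the file,
    // so the DEBUG level supplied by the script wins over the file's INFO.
    Logger.getLogger(RootLoggerDemo.class).debug("printed because DEBUG won");
  }
}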


Regards,
Vinay


On Wed, May 21, 2014 at 6:42 PM, Robert Rati  wrote:

  I noticed in hadoop-config.sh there is this line:


HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"

which sets a root logger if HADOOP_ROOT_LOGGER isn't set.  Why is this
needed here?  There is a log4j.properties file provided that defines a
default logger.  I believe the line above will result in overriding
whatever is set for the root logger in the log4j.properties file.  This
has caused some confusion and hacks to work around it.

Is there a reason not to remove the above code and just have all the
logger definitions in the log4j.properties file?  Is there maybe a
compatibility concern?

Rob


Re: HADOOP_ROOT_LOGGER

2014-05-22 Thread Colin McCabe
It's not always practical to edit the log4j.properties file.  For one
thing, if you're using a management system, there may be many log4j
properties sprinkled around the system, and it could be difficult to figure
out which is the one you need to edit.  For another, you may not (should
not?) have permission to do this on a production cluster.

Doing something like "HADOOP_ROOT_LOGGER="DEBUG,console" hadoop fs -cat
/foo" has helped me diagnose problems in the past.

best,
Colin


On Thu, May 22, 2014 at 6:34 AM, Robert Rati  wrote:

> In my experience the default HADOOP_ROOT_LOGGER definition will override
> any root logger defined in log4j.properties, which is where the problems
> have arisen.  If the HADOOP_ROOT_LOGGER definition in hadoop-config.sh were
> removed, wouldn't the root logger defined in the log4j.properties file be
> used?  Or do the client commands not read that configuration file?
>
> I'm trying to understand why the root logger should be defined outside of
> the log4j.properties file.
>
> Rob
>
>
> On 05/22/2014 12:53 AM, Vinayakumar B wrote:
>
>> Hi Robert,
>>
>> I understand your confusion.
>>
>> HADOOP_ROOT_LOGGER is set to the default value "INFO,console" if it hasn't
>> been set to anything, and logs will be displayed on the console itself.
>> This is true for any client command you run, for example: "hdfs dfs -ls /"
>>
>> But for the server scripts (hadoop-daemon.sh, yarn-daemon.sh, etc.),
>> HADOOP_ROOT_LOGGER will be set to "INFO,RFA" if the HADOOP_ROOT_LOGGER env
>> variable is not defined, so that all the log messages of the server daemons
>> go to log files maintained by the RollingFileAppender. If you want to
>> override these defaults and set your own log level, then define it as the
>> env variable HADOOP_ROOT_LOGGER.
>>
>> For example:
>> export HADOOP_ROOT_LOGGER="DEBUG,RFA"
>> Export the above env variable and then start the server scripts or execute
>> client commands; all logs go to files and will be maintained by the
>> RollingFileAppender.
>>
>>
>> Regards,
>> Vinay
>>
>>
>> On Wed, May 21, 2014 at 6:42 PM, Robert Rati  wrote:
>>
>>  I noticed in hadoop-config.sh there is this line:
>>>
>>> HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
>>>
>>> which sets a root logger if HADOOP_ROOT_LOGGER isn't set.  Why is this
>>> needed here?  There is a log4j.properties file provided that defines a
>>> default logger.  I believe the line above will result in overriding
>>> whatever is set for the root logger in the log4j.properties file.  This
>>> has caused some confusion and hacks to work around it.
>>>
>>> Is there a reason not to remove the above code and just have all the
>>> logger definitions in the log4j.properties file?  Is there maybe a
>>> compatibility concern?
>>>
>>> Rob
>>>
>>>
>>
>>
>>


Re: getCounters NPE

2014-05-22 Thread Jay Vyas
(Sorry, I meant THROW an NPE, not "return a null".)  Big difference of
course!


On Thu, May 22, 2014 at 1:36 PM, Jay Vyas  wrote:

> Hi hadoop ...  Is there a reason why line 220, below, should ever return
> null when
> being called through the code path of job.getCounters() ?
>
> It appears that there could be some indirection involving RPCs, etc., so I'm
> thinking it's best to ask before I attempt to trace the call.
>
> 217 @Override
> 218 public GetCountersResponse getCounters(GetCountersRequest request)
> 219 throws IOException {
> 220   JobId jobId = request.getJobId();
> 221   Job job = verifyAndGetJob(jobId);
> 222   GetCountersResponse response =
> recordFactory.newRecordInstance(GetCountersResponse.class);
> 223   response.setCounters(TypeConverter.toYarn(job.getAllCounters()));
> 224   return response;
> 225 }
>
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com
>



-- 
Jay Vyas
http://jayunit100.blogspot.com


getCounters NPE

2014-05-22 Thread Jay Vyas
Hi hadoop ...  Is there a reason why line 220, below, should ever return
null when
being called through the code path of job.getCounters() ?

It appears that there could be some indirection involving RPCs, etc., so I'm
thinking it's best to ask before I attempt to trace the call.

217 @Override
218 public GetCountersResponse getCounters(GetCountersRequest request)
219 throws IOException {
220   JobId jobId = request.getJobId();
221   Job job = verifyAndGetJob(jobId);
222   GetCountersResponse response =
recordFactory.newRecordInstance(GetCountersResponse.class);
223   response.setCounters(TypeConverter.toYarn(job.getAllCounters()));
224   return response;
225 }
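
As a purely illustrative, caller-side guard while the server-side cause is
unclear, the sketch below checks the obvious nulls around this call path. The
job id is made up, and nothing here is a claim about where the NPE actually
originates.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobID;

public class SafeGetCounters {
  public static void main(String[] args) throws Exception {
    Cluster cluster = new Cluster(new Configuration());
    // Example job id only
    Job job = cluster.getJob(JobID.forName("job_1400000000000_0001"));
    if (job == null) {
      System.err.println("job not found (it may have been retired)");
      return;
    }
    try {
      // Client entry point into the server-side code path quoted above
      Counters counters = job.getCounters();
      System.out.println(counters == null ? "no counters available" : counters.toString());
    } catch (Exception e) {
      // An exception thrown in the server-side handler (e.g. the NPE above)
      // propagates back to the caller; log it instead of crashing.
      e.printStackTrace();
    }
  }
}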


-- 
Jay Vyas
http://jayunit100.blogspot.com


Struggling New Developer!

2014-05-22 Thread Ghunaim, Hussam
Hello everyone,


After studying a course in Hadoop and MapReduce and writing a couple of
basic jobs, I tried to participate in developing these technologies toward
their next generation. However, when I looked through some of the submitted
patches, I realized that I am still missing a lot to be at the required
level. I read a couple of textbooks, but all of them provided high-level
descriptions and none of them provided in-depth discussions of Hadoop and
MapReduce operations.


Please, how can I improve up to the level required to participate effectively
in this forum? Thanks a lot in advance.


Regards,

Hussam


[jira] [Created] (HADOOP-10626) Limit Returning Attributes for LDAP search

2014-05-22 Thread Jason Hubbard (JIRA)
Jason Hubbard created HADOOP-10626:
--

 Summary: Limit Returning Attributes for LDAP search
 Key: HADOOP-10626
 URL: https://issues.apache.org/jira/browse/HADOOP-10626
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 2.3.0
Reporter: Jason Hubbard


When using the Hadoop LDAP group mapping in an enterprise environment, searching 
groups and returning all of their members can take a long time and cause a timeout, 
in which case not all of a user's groups are returned.  Because the first search only 
looks up the user DN and the second search only needs the group member attribute, we 
can limit the search to return just the group member attribute, which speeds it up.
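
As a minimal standalone JNDI sketch of the kind of change proposed here (this
is not the actual LdapGroupsMapping code, and the server URL, base DN, filter,
and attribute name are examples only), setReturningAttributes() asks the LDAP
server to send back just the membership attribute rather than every attribute
of every matching group entry:

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class LimitedAttributeGroupSearch {
  public static void main(String[] args) throws Exception {
    Hashtable<String, String> env = new Hashtable<String, String>();
    env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389");   // example server
    DirContext ctx = new InitialDirContext(env);

    SearchControls controls = new SearchControls();
    controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
    // The improvement in question: return only the membership attribute,
    // not the full attribute set of each matching group entry.
    controls.setReturningAttributes(new String[] { "member" });

    NamingEnumeration<SearchResult> groups = ctx.search(
        "ou=groups,dc=example,dc=com",             // example base DN
        "(&(objectClass=group)(member={0}))",      // {0} = the user DN from the first search
        new Object[] { "uid=alice,ou=people,dc=example,dc=com" },
        controls);
    while (groups.hasMore()) {
      System.out.println(groups.next().getNameInNamespace());
    }
    ctx.close();
  }
}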



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: HADOOP_ROOT_LOGGER

2014-05-22 Thread Robert Rati
In my experience the default HADOOP_ROOT_LOGGER definition will override 
any root logger defined in log4j.properties, which is where the problems 
have arisen.  If the HADOOP_ROOT_LOGGER definition in hadoop-config.sh 
were removed, wouldn't the root logger defined in the log4j.properties 
file be used?  Or do the client commands not read that configuration file?


I'm trying to understand why the root logger should be defined outside 
of the log4j.properties file.


Rob

On 05/22/2014 12:53 AM, Vinayakumar B wrote:

Hi Robert,

I understand your confusion.

HADOOP_ROOT_LOGGER is set to the default value "INFO,console" if it hasn't
been set to anything, and logs will be displayed on the console itself.
This is true for any client command you run, for example: "hdfs dfs -ls /"

But for the server scripts (hadoop-daemon.sh, yarn-daemon.sh, etc.),
HADOOP_ROOT_LOGGER will be set to "INFO,RFA" if the HADOOP_ROOT_LOGGER env
variable is not defined, so that all the log messages of the server daemons
go to log files maintained by the RollingFileAppender. If you want to
override these defaults and set your own log level, then define it as the
env variable HADOOP_ROOT_LOGGER.

For example:
export HADOOP_ROOT_LOGGER="DEBUG,RFA"
Export the above env variable and then start the server scripts or execute
client commands; all logs go to files and will be maintained by the
RollingFileAppender.


Regards,
Vinay


On Wed, May 21, 2014 at 6:42 PM, Robert Rati  wrote:


I noticed in hadoop-config.sh there is this line:

HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"

which sets a root logger if HADOOP_ROOT_LOGGER isn't set.  Why is this
needed here?  There is a log4j.properties file provided that defines a
default logger.  I believe the line above will result in overriding
whatever is set for the root logger in the log4j.properties file.  This has
caused some confusion and hacks to work around it.

Is there a reason not to remove the above code and just have all the
logger definitions in the log4j.properties file?  Is there maybe a
compatibility concern?

Rob


[jira] [Resolved] (HADOOP-8446) make hadoop-core jar OSGi friendly

2014-05-22 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-8446.


Resolution: Duplicate

> make hadoop-core jar OSGi friendly
> --
>
> Key: HADOOP-8446
> URL: https://issues.apache.org/jira/browse/HADOOP-8446
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Freeman Fang
>
> hadoop-core isn't OSGi friendly, so those who want to use it in an OSGi 
> container must wrap it with a tool like bnd/maven-bundle-plugin. Apache 
> ServiceMix always wraps 3rd-party jars that aren't OSGi friendly; you can see 
> we've done it for lots of jars here[1], and more specifically for several 
> hadoop-core versions[2].  Though we may keep doing it this way, the problem 
> is that we need to do it for every newly released version of the 3rd-party jars, 
> and, more importantly, we need to ensure other Apache project communities are 
> aware that we're doing it.
> In ServiceMix we just wrap hadoop-core 1.0.3; the issue tracking it in 
> ServiceMix is [3].
> We hope Apache Hadoop can offer OSGi-friendly jars. In most cases it 
> should be straightforward, as it just needs OSGi metadata headers added to 
> MANIFEST.MF; this can be done easily with maven-bundle-plugin if built with 
> Maven.  There are also some other practices that should be followed, like 
> different modules not sharing the same package (avoid split packages). 
> thanks
> [1]http://repo2.maven.org/maven2/org/apache/servicemix/bundles
> [2]http://repo2.maven.org/maven2/org/apache/servicemix/bundles/org.apache.servicemix.bundles.hadoop-core/
> [3]https://issues.apache.org/jira/browse/SMX4-1147



--
This message was sent by Atlassian JIRA
(v6.2#6252)