Re: Solaris Port SOLVED!

2014-12-16 Thread Steve Loughran
On 16 December 2014 at 16:01, malcolm  wrote:

> 1. Findbugs , 3 warnings in Java code (which of course I did not touch)
> 2. Test failures also with no connection to terror: A java socket timeout,
>

These are ongoing issues with (1) the transition to Java 7 builds and (2) some
intermittent tests that need to get fixed. Ignore them.



Re: Solaris Port SOLVED!

2014-12-16 Thread Charles Lamb

On 12/16/2014 11:01 AM, malcolm wrote:

> This is weird, Jenkins complains about:
>
> 1. Findbugs, 3 warnings in Java code (which of course I did not touch)


The FB warnings seem to be a recent phenomenon. I have seen them on a 
recent test run of my own and they come and go depending on the run. I 
think they can be safely ignored. However, if you want to be sure, then 
you could do the findbugs run on your local machine both with and 
without your patch applied and compare the results. If you find that 
there's no difference, then just put a comment in the Jira stating that.


> 2. Test failures also with no connection to terror: A java socket timeout,

Yes, probably unrelated. To be sure, run those same tests on your local 
machine and if they pass, then put a comment in the Jira saying that 
they run on your local machine. If they fail, then run them with and 
without the patch to make sure they fail both ways.


Charles



Re: Solaris Port SOLVED!

2014-12-16 Thread malcolm

This is weird, Jenkins complains about:

1. Findbugs , 3 warnings in Java code (which of course I did not touch)
2. Test failures also with no connection to terror: A java socket timeout,

As a newbie, I am not quite sure how to relate to this.
(I could just revert the code back, and see if I get the same errors 
anyway.)



Re: Solaris Port SOLVED!

2014-12-15 Thread malcolm

Done, and added the comment as you requested.
I attached a second patch file to the JIRA (with .002 appended as per
convention), assuming Jenkins knows to take the latest version, since I
understand that I cannot remove the previous patch file.



Re: Solaris Port SOLVED!

2014-12-15 Thread Colin McCabe
Thanks, Malcolm.  I reviewed it.  The only thing you still have to do
is hit "submit patch" to get a Jenkins run.  See our HowToContribute
wiki page for more details.

wiki.apache.org/hadoop/HowToContribute

best,
Colin


Re: Solaris Port SOLVED!

2014-12-13 Thread malcolm
I am checking on the latest release of Solaris 11 and yes, it is still 
thread safe (or MT Safe as documented on the man page).


strerror checks the error code, and returns the same "unknown error" 
string as terror does, if it receives an invalid code. I checked this on 
Windows, Solaris and Linux (though my changes only affect Solaris 
platforms).


JIRA newbie question:

I have filed HADOOP-11403 with the patch attached against trunk, and
asked for reviewers in the comments section.

Is there any other protocol I should follow?

Thanks,
Malcolm


RE: Solaris Port SOLVED!

2014-12-13 Thread Asokan, M
Malcolm,
   That's great! Is strerror() thread-safe in the recent version of Solaris?  
In any case, to be correct you still need to make sure that the code passed to 
strerror() is a valid one.  For this you need to check errno after the call to 
strerror().  Please check the code snippet I sent earlier for HPUX.

-- Asokan


Re: Solaris Port SOLVED!

2014-12-13 Thread malcolm

Wiping egg off face  ...

After consulting with the Solaris team (and looking at the source code
and man page), it turns out that strerror itself on Solaris is MT-Safe!
(Just like HPUX)


So, after all this effort, all I need to do is modify terror as follows:

   const char* terror(int errnum)
   {
   #if defined(__sun)
     return strerror(errnum); // MT-Safe under Solaris
   #else
     if ((errnum < 0) || (errnum >= sys_nerr)) {
       return "unknown error.";
     }
     return sys_errlist[errnum];
   #endif
   }

And in two other files where sys_errlist is referenced directly
(NativeIO and hdfs_http_client.c), I replaced the direct access
with a call to terror.


Thanks for all your help and patience,

I'll file a JIRA asap,

Cheers,
Malcolm

On 12/13/2014 05:26 PM, malcolm wrote:

Thanks Asokan,

Looked up GCC's thread-local variables; they seem a bit complex, though,
and quite specific to GNU.


Initialization of the static errlist array should be thread safe, i.e.
initially the array is nulled out, and afterwards if two threads write
to the same address, then they would be writing the same string.


But if we are ok with changing 5 files, not just terror, then I would 
just remove terror completely and use strerror_r (or the alternatives 
for Windows and HP_UX) in the caller code instead i.e. using your 
suggestion for a local buffer in the caller, wherever needed. The more 
I think about it, the more this seems to be the right thing to do.


Cheers,
Malcolm


On 12/13/2014 04:38 PM, Asokan, M wrote:

Malcolm,
Gcc supports thread-local variables. See

https://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Thread-Local.html

I am not sure about native compilers on Solaris, HPUX, or AIX.

In any case, I found out that the Windows native code in Hadoop seems 
to handle error messages properly. Here is what I found:


$ find ~/work/hadoop/hadoop-trunk/ -name '*.c'|xargs grep FormatMessage|awk -F: '{print $1}'|sort -u
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/NativeIO.c 

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/JniBasedUnixGroupsMappingWin.c 

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/winutils/libwinutils.c 




$ find ~/work/hadoop/hadoop-trunk/ -name '*.c'|xargs grep terror|awk -F: '{print $1}'|sort -u
/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/exception.c 

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/SharedFileDescriptorFactory.c 

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/net/unix/DomainSocket.c 

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/net/unix/DomainSocketWatcher.c 

/home/asokan/work/hadoop/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/JniBasedUnixGroupsMapping.c 




This means you need not worry about the Windows version of terror(). 
You need to change five files that contain UNIX specific native code.


I have a question on your suggested implementation:

How do you initialize the static errlist array in a thread-safe manner?


Here is another thread-safe implementation that I could come up with:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MESSAGE_BUFFER_SIZE 256

/* Copies the system message for code into buf and returns buf. */
char * getSystemErrorMessage(char * buf, int buf_len, int code) {
#if defined(_HPUX_SOURCE)
  /* HP-UX branch: call strerror() and check errno afterwards. */
  char * msg;
  errno = 0;
  msg = strerror(code);
  if (errno == 0) {
    strncpy(buf, msg, buf_len-1);
    buf[buf_len-1] = '\0';
  } else {
    snprintf(buf, buf_len, "%s %d",
             "Can't get system error message for code", code);
  }
#else
  /* Assumes the XSI strerror_r(), which returns 0 on success. */
  if (strerror_r(code, buf, buf_len) != 0) {
    snprintf(buf, buf_len, "%s %d",
             "Can't get system error message for code", code);
  }
#endif
  return buf;
}

/* Expects a local char array named messageBuffer in the calling scope. */
#define TERROR(code) \
  getSystemErrorMessage(messageBuffer, sizeof(messageBuffer), code)

int main(int argc, char ** argv) {
  if (argc > 1) {
    char messageBuffer[MESSAGE_BUFFER_SIZE];
    int code = atoi(argv[1]);

    fprintf(stderr, "System error for code %s: %s\n", argv[1],
            TERROR(code));
  }
  return 0;
}


This changes terror to a macro TERROR and requires every function that
calls the TERROR macro to declare the local variable messageBuffer. Since
there are only five files to modify, I think it is not a big effort.
What do you think?


-- Asokan

On 12/13/2014 04:29 AM, malcolm wrote:
Colin,

I am not sure what you mean by a thread-local buffer (in native 
code). In Java this is pretty standard, but I couldn't find any 
implementation for C code.


Here is the terror function:

 const char* terror(int errnum)
 {
   if ((errnum