Forgot to attach...

Brian Christiansen wrote:
There could be something more than what was pointed out and fixed earlier. Here is a patch with the changes that were checked in for this, or a similar, issue. Check that your version has this applied. If this doesn't help, try compiling as 32bit as Garrick suggested.

Brian

Wickliffe, Blake W wrote:
Right...but my suspicion is that we are facing something else, since Brian 
claims the issue with the 32/64bit was fixed in 3.2.6p21, which we are already 
on.

Unless I am misunderstanding you?

Blake Wickliffe
Saudi Aramco
ENOD/CSYS/USG HPC Team
(873-4417)


-----Original Message-----
From: Garrick [mailto:[email protected]]
Sent: Tuesday, August 18, 2009 7:35 AM
To: Wickliffe, Blake W
Cc: Maui Users
Subject: Re: [Mauiusers] Corrupt node feature list

That's fine. 32bit maui build works fine on 64bit host talking to a
64bit pbs_server.

HPCC/Linux Systems Admin

On Aug 17, 2009, at 9:23 PM, "Wickliffe, Blake W" <[email protected]> wrote:

Unfortunately, we are already using 3.2.6p21, and it is on a 64-bit
system.  So, if that's the case, even reverting back to 32-bit might
not work.

Blake Wickliffe
Saudi Aramco
ENOD/CSYS/USG HPC Team
(873-4417)


-----Original Message-----
From: [email protected] [mailto:mauiusers-[email protected]] On Behalf Of Brian Christiansen
Sent: Monday, August 17, 2009 9:27 PM
To: Maui Users
Subject: Re: [Mauiusers] Corrupt node feature list

There was an issue, previously, where you could only have 32 node
features on a 64bit system without seeing side effects. If you aren't
using the latest snapshot, you could try it and see if it helps.

From the changelog:
Maui 3.2.6p21
- Fixed 64bit issue. Maui assumed ints were always 8 bytes for 64bit
systems even though x86_64 ints are still 4 bytes. This lead to aliasing
of large indexed node properties to smaller indexed properties. Maui now
triggers off of sizeof(int). Thanks goes to Alexis Cousein.

Brian Christiansen

Garrick Staples wrote:
Please start new threads with the "new" button in your email
client, not with the "reply" button.

On Mon, Aug 17, 2009 at 04:04:21PM +0300, Wickliffe, Blake W alleged:

Hi all,
Has anyone experienced a problem with Maui corrupting the features
list of nodes after a certain number of nodes are added?

On our cluster, we have 2336 nodes, most of which have only 1
"Feature" or "Property" in the Torque parlance.  However,
immediately upon adding another node, we start seeing things like:

Features:   [[NONE]][checki][datai]

When doing a "checknode" on various nodes.  The problem only gets
worse and more extensive as further nodes are added.  Deleting the
nodes from the qmgr brings everything back to normal.

Any ideas?

Yes, I've seen this with 64bit builds.  Build maui 32bit and it
won't happen.



_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers


The contents of this email, including all related responses, files
and attachments transmitted with it (collectively referred to as
"this Email"), are intended solely for the use of the individual/
entity to whom/which they are addressed, and may contain
confidential and/or legally privileged information. This Email may
not be disclosed or forwarded to anyone else without authorization
from the originator of this Email. If you have received this Email
in error, please notify the sender immediately and delete all copies
from your system. Please note that the views or opinions presented
in this Email are those of the author and may not necessarily
represent those of Saudi Aramco. The recipient should check this
Email and any attachments for the presence of any viruses. Saudi
Aramco accepts no liability for any damage caused by any virus/error
transmitted by this Email.

Index: include/moab.h
===================================================================
--- include/moab.h	(revision 107)
+++ include/moab.h	(revision 108)
@@ -125,6 +125,7 @@
 #define M32UINT4     unsigned long
 #define M32UINT8     unsigned long long
 
+/* ints on x86_64 are still 4 bytes */
 #ifdef __M64
 #define MINTBITS   64
 #define MINTLBITS  6
Index: include/moab-proto.h
===================================================================
--- include/moab-proto.h	(revision 107)
+++ include/moab-proto.h	(revision 108)
@@ -453,6 +453,7 @@
 int MSysDestroyObjects(void);
 int MSysDiagnose(char *,int,long);
 int MSysStartServer(int);
+int M64Init(m64_t *);
 
 
 
Index: CHANGELOG
===================================================================
--- CHANGELOG	(revision 107)
+++ CHANGELOG	(revision 108)
@@ -1,5 +1,7 @@
 Maui 3.2.6p21
   - Fixed CHECKSUM authentication for maui + slurm. Thanks goes to Eyegene Ryabinkin.
+  - Fixed 64bit issue. Maui assumed ints were always 8 bytes for 64bit systems even though x86_64 ints are still 4 bytes. This lead to aliasing of large indexed node properties to smaller indexed properties. Maui now triggers off of sizeof(int). Thanks goes to Alexis Cousein.
+  - Fixed an optimiztion issue with x86_64 systems. -O2 was optimizing out parts of the communication strings.
 
 Maui 3.2.6p20
   - Fixed a potential security issue when Maui is used with some PBS configurations.
Index: src/server/OUserI.c
===================================================================
--- src/server/OUserI.c	(revision 107)
+++ src/server/OUserI.c	(revision 108)
@@ -33,6 +33,8 @@
 
   long      tmpL;
 
+  char     tmpLine[MMAX_LINE];
+
   const char *FName = "UIProcessCommand";
 
   DBG(3,fUI) DPrint("%s(S)\n",
@@ -411,16 +413,16 @@
 
       S->SBufSize = (long)sizeof(SBuffer);
 
-      sprintf(S->SBuffer,"%s%d ",
-	MCKeyword[mckStatusCode],
-	scFAILURE);
+      sprintf(tmpLine,"%s%d ",
+        MCKeyword[mckStatusCode],
+        scFAILURE);
 
-      Align = (int)strlen(S->SBuffer) + (int)strlen(MCKeyword[mckArgs]);
+      Align = (int)strlen(tmpLine) + (int)strlen(MCKeyword[mckArgs]);
 
       sprintf(S->SBuffer,"%s%*s%s",
-        S->SBuffer,
-	16 - (Align % 16), 
-	" ",
+        tmpLine,
+	      16 - (Align % 16), 
+	      " ",
         MCKeyword[mckArgs]);
 
       HeadSize = (int)strlen(SBuffer);
@@ -429,7 +431,7 @@
       if (Function[sindex] != NULL)
         scode = (*Function[sindex])(args,S->SBuffer + HeadSize,FLAGS,Auth,&S->SBufSize);
       else
-	scode = FAILURE;
+        scode = FAILURE;
 
       ptr = S->SBuffer + strlen(MCKeyword[mckStatusCode]);
 
Index: src/server/mclient.c
===================================================================
--- src/server/mclient.c	(revision 107)
+++ src/server/mclient.c	(revision 108)
@@ -10,6 +10,7 @@
 #define MAX_MCARGS  128
 
 extern mattrlist_t MAList;
+extern m64_t       M64;
 
 int MCResCreate(char *);
 int MCJobShow(char *);
@@ -563,6 +564,8 @@
   DBG(2,fALL) DPrint("%s()\n",
     FName);
 
+  M64Init(&M64);
+
   MUBuildPList(MCfg,MParam);
  
   strcpy(C.ServerHost,DEFAULT_MSERVERHOST);
Index: src/mcom/MSU.c
===================================================================
--- src/mcom/MSU.c	(revision 107)
+++ src/mcom/MSU.c	(revision 108)
@@ -1303,9 +1303,11 @@
 
       if (DoSocketLayerAuth == TRUE)
         {
+        char tmpStr[MMAX_BUFFER];
+
         time(&Now);
 
-        sprintf(TSLine,"%s%ld %s%s",
+        sprintf(tmpStr,"%s%ld %s%s",
           MCKeyword[mckTimeStamp],
           (long)Now,
           MCKeyword[mckAuth],
@@ -1320,7 +1322,7 @@
           }
 
         sprintf(TSLine,"%s %s",
-          TSLine,
+          tmpStr,
           MCKeyword[mckData]);
         
         MSecGetChecksum2(
Index: src/mcom/MSec.c
===================================================================
--- src/mcom/MSec.c	(revision 107)
+++ src/mcom/MSec.c	(revision 108)
@@ -130,7 +130,6 @@
 
 
 
-#ifndef __M32COMPAT
 
 int M64Init(
 
@@ -143,10 +142,10 @@
 
     M->Is64     = FALSE;
 
-    M->INTBC    = M32INTBITS;
-    M->INTLBC   = M32INTLBITS;
-    M->MIntSize = M32INTSIZE;
-    M->IntShift = M32INTSHIFT;
+    M->INTBITS  = M32INTBITS;
+    M->INTLBITS = M32INTLBITS;
+    M->INTSIZE  = M32INTSIZE;
+    M->INTSHIFT = M32INTSHIFT;
     }
   else
     {
@@ -154,10 +153,10 @@
 
     M->Is64     = TRUE;
 
-    M->INTBC    = M64INTBITS;
-    M->INTLBC   = M64INTLBITS;
-    M->MIntSize = M64INTSIZE;
-    M->IntShift = M64INTSHIFT;
+    M->INTBITS  = M64INTBITS;
+    M->INTLBITS = M64INTLBITS;
+    M->INTSIZE  = M64INTSIZE;
+    M->INTSHIFT = M64INTSHIFT;
     }
 
   MDB(5,fSTRUCT) MLog("INFO:     64Bit enabled: %s  UINT4[%d]  UINT8[%d]\n",
@@ -168,7 +167,6 @@
   return(SUCCESS);
   }  /* END M64Init() */
 
-#endif /* !__M32COMPAT */
 
 
 
Index: src/moab/MPar.c
===================================================================
--- src/moab/MPar.c	(revision 107)
+++ src/moab/MPar.c	(revision 108)
@@ -23,6 +23,7 @@
 extern mrm_t        MRM[];
 extern mstat_t      MStat;
 extern mattrlist_t  MAList;
+extern m64_t        M64;
  
 extern const char *MQALType[];
 extern const char *MResourceType[];
@@ -1252,7 +1253,7 @@
     {
     P = &MPar[pindex];    
 
-    if (!(BM[pindex >> MINTLBITS] & (1 << (pindex % MINTBITS))))
+    if (!(BM[pindex >> M64.INTLBITS] & (1 << (pindex % M64.INTBITS))))
       continue;
 
     if (P->Name[0] == '\0')
Index: src/moab/MTrace.c
===================================================================
--- src/moab/MTrace.c	(revision 107)
+++ src/moab/MTrace.c	(revision 108)
@@ -25,6 +25,7 @@
 extern mqos_t      MQOS[];
 extern mpar_t      MPar[];
 extern mrm_t       MRM[];
+extern m64_t       M64;
 
 extern mframe_t    MFrame[];
 
@@ -1219,7 +1220,7 @@
 
       J->SpecFlags |= MSim.TraceDefaultJobFlags;
 
-      for (index = 0;index < MINTBITS;index++)
+      for (index = 0;index < M64.INTBITS;index++)
         {
         if (!(MSim.TraceIgnoreJobFlags & (1 << index)))
           continue;
Index: src/moab/MQOS.c
===================================================================
--- src/moab/MQOS.c	(revision 107)
+++ src/moab/MQOS.c	(revision 108)
@@ -17,6 +17,7 @@
 extern mgcred_t    MGroup[];
 extern mgcred_t    MAcct[];
 extern mclass_t    MClass[];
+extern m64_t       M64;
 
 extern const char *MQOSFlags[];
 extern const char *MQALType[];
@@ -896,7 +897,7 @@
 
   for (bindex = 0;bindex < MAX_MQOS;bindex++)
     {
-    if (!(BM[bindex >> MINTLBITS] & (1 << (bindex % MINTBITS))))
+    if (!(BM[bindex >> M64.INTLBITS] & (1 << (bindex % M64.INTBITS))))
       continue;
 
     Q = &MQOS[bindex];
Index: src/moab/MSys.c
===================================================================
--- src/moab/MSys.c	(revision 107)
+++ src/moab/MSys.c	(revision 108)
@@ -38,6 +38,7 @@
 mrmfunc_t MRMFunc[MAX_MRMTYPE];
 msim_t    MSim;
 msys_t    MSys;                   /* cluster layout */
+m64_t     M64;
 
 mx_t      X;
 int       MFQ[MAX_MJOB];          /* terminated by '-1' value      */
@@ -98,6 +99,8 @@
 
   S->X    = (void *)&X;
 
+  M64Init(&M64);
+
   MOSSyslogInit(S);
 
   MUBuildPList((mcfg_t *)MCfg,MParam);
Index: src/moab/MSched.c
===================================================================
--- src/moab/MSched.c	(revision 107)
+++ src/moab/MSched.c	(revision 108)
@@ -19,6 +19,7 @@
 extern mframe_t    MFrame[];
 extern mckpt_t     MCP;
 extern mres_t     *MRes[];
+extern m64_t       M64;
 
 extern int         MAQ[];
 extern int         MUIQ[];
@@ -2256,8 +2257,8 @@
 
         for (sindex = 0;sindex < MaxSet;sindex++)
           {
-          if (N->FBM[SetIndex[sindex] >> MINTLBITS] & 
-             (1 << (SetIndex[sindex] % MINTBITS)))
+          if (N->FBM[SetIndex[sindex] >> M64.INTLBITS] & 
+             (1 << (SetIndex[sindex] % M64.INTBITS)))
             {
             SetCount[sindex] += TC;
             SetNC[sindex] ++;
@@ -2422,8 +2423,8 @@
             {
             case mrstFeature:
 
-              if (N->FBM[SetIndex[sindex] >> MINTLBITS] & 
-                 (1 << (SetIndex[sindex] % MINTBITS)))
+              if (N->FBM[SetIndex[sindex] >> M64.INTLBITS] & 
+                 (1 << (SetIndex[sindex] % M64.INTBITS)))
                 {  
                 /* node is feasible */
 
@@ -2576,8 +2577,8 @@
           {
           case mrstFeature:
  
-            if (N->FBM[SetIndex[BestSet] >> MINTLBITS] & 
-               (1 << (SetIndex[BestSet] % MINTBITS)))
+            if (N->FBM[SetIndex[BestSet] >> M64.INTLBITS] & 
+               (1 << (SetIndex[BestSet] % M64.INTBITS)))
               {
               /* node is in set */
  
Index: src/moab/MUtil.c
===================================================================
--- src/moab/MUtil.c	(revision 107)
+++ src/moab/MUtil.c	(revision 108)
@@ -23,6 +23,7 @@
 extern const char *MNodeState[];
 extern const char *MHRObj[];
 extern const char *MResourceType[];
+extern m64_t       M64;
 
 extern mx_t      X;
 
@@ -788,7 +789,7 @@
     return(SUCCESS);
     }
 
-  if ((AttrValue == NULL) || (MapSize < MINTSIZE))
+  if ((AttrValue == NULL) || (MapSize < M64.INTSIZE))
     {
     return(FAILURE);
     }
@@ -805,7 +806,7 @@
     if (!strcmp(MAList[AttrIndex][index],AttrValue))
       {
       if (AttrMap != NULL)
-        AttrMap[index >> MINTLBITS] |= 1 << (index % MINTBITS);
+        AttrMap[index >> M64.INTLBITS] |= 1 << (index % M64.INTBITS);
 
       return(SUCCESS);
       }
@@ -822,7 +823,7 @@
 
     MUStrCpy(MAList[AttrIndex][index],AttrValue,sizeof(MAList[0][0]));
 
-    AttrMap[index >> MINTLBITS] |= 1 << (index % MINTBITS);
+    AttrMap[index >> M64.INTLBITS] |= 1 << (index % M64.INTBITS);
 
     DBG(5,fSTRUCT) DPrint("INFO:     added MAList[%s][%d]: '%s'\n",
       MAttrType[AttrIndex],
@@ -1069,7 +1070,7 @@
 
   Line[0] = '\0';
 
-  for (i = 1;i < MINTBITS;i++)
+  for (i = 1;i < M64.INTBITS;i++)
     {
     if ((Value & (1 << i)) && (MAList[Attr][i][0] != '\0'))
       {
@@ -1097,7 +1098,7 @@
   int         index;
   int         findex;
 
-  if ((ValueMap == NULL) || (MapSize < MINTSIZE))
+  if ((ValueMap == NULL) || (MapSize < M64.INTSIZE))
     {
     strcpy(Line,NONE);
 
@@ -1106,16 +1107,16 @@
 
   Line[0] = '\0';
 
-  for (findex = 0;findex < (MapSize >> MINTSHIFT);findex++)
+  for (findex = 0;findex < (MapSize >> M64.INTSHIFT);findex++)
     {
-    for (index = 0;index < MINTBITS;index++)
+    for (index = 0;index < M64.INTBITS;index++)
       {
       if ((ValueMap[findex] & (1 << index)) && 
           (MAList[AttrIndex][index][0] != '\0'))
         {
         sprintf(Line,"%s[%s]",
           Line,
-          MAList[AttrIndex][index + findex * MINTBITS]);
+          MAList[AttrIndex][index + findex * M64.INTBITS]);
         }
       }    /* END for (index) */
     }      /* END for (findex) */
@@ -1152,7 +1153,7 @@
     return(NULL);
     }
  
-  if ((ValueMap == NULL) || (MapSize < MINTSIZE))
+  if ((ValueMap == NULL) || (MapSize < M64.INTSIZE))
     {
     return(NULL);
     }
@@ -1162,7 +1163,7 @@
  
   for (findex = 0;findex < (MapSize >> 2);findex++)
     {
-    for (index = 0;index < MINTBITS;index++)
+    for (index = 0;index < M64.INTBITS;index++)
       {
       if ((ValueMap[findex] & (1 << index)) &&
           (MAList[AttrIndex][index][0] != '\0'))
@@ -1217,7 +1218,7 @@
     return(Line);
     }
 
-  for (i = 1;i < MINTBITS;i++)
+  for (i = 1;i < M64.INTBITS;i++)
     {
     if ((Value & (1 << i)) && (MAList[Attr][i][0] != '\0'))
       {
@@ -4121,7 +4122,7 @@
 
   ptr[0] = '\0';
 
-  for (i = 1;i < MINTBITS;i++)
+  for (i = 1;i < M64.INTBITS;i++)
     {
     if ((BM & (1 << i)) && (AList[i] != NULL) && (AList[i][0] != '\0'))
       {
@@ -4252,7 +4253,7 @@
   int mindex;
   int len;
 
-  len = MAX(1,(MapSize >> MINTLBITS));
+  len = MAX(1,(MapSize >> M64.INTLBITS));
 
   for (mindex = 0;mindex < len;mindex++)
     {
@@ -4275,7 +4276,7 @@
   int mindex;
   int len;
 
-  len = MAX(1,(MapSize >> MINTLBITS));
+  len = MAX(1,(MapSize >> M64.INTLBITS));
  
   for (mindex = 0;mindex < len;mindex++)
     {
@@ -5413,7 +5414,7 @@
 
   char        *ptr;
 
-  if ((ValueMap == NULL) || (MapSize < MINTSIZE))
+  if ((ValueMap == NULL) || (MapSize < M64.INTSIZE))
     {
     strcpy(Line,NONE);
 
@@ -5422,14 +5423,14 @@
 
   Line[0] = '\0';
 
-  for (findex = 0;findex < (MapSize >> MINTSHIFT);findex++)
+  for (findex = 0;findex < (MapSize >> M64.INTSHIFT);findex++)
     {
-    for (index = 0;index < MINTBITS;index++)
+    for (index = 0;index < M64.INTBITS;index++)
       {
       if ((ValueMap[findex] & (1 << index)) &&
           (MAList[AttrIndex][index][0] != '\0'))
         {
-        ptr = MAList[AttrIndex][index + findex * MINTBITS];
+        ptr = MAList[AttrIndex][index + findex * M64.INTBITS];
 
         if (Delim != '\0')
           {