I have created a set of examples of the amf malfunctions and amf crashes I have 
experienced. As Praveen requested, the message logs, /etc/opensaf 
configurations and trace logs for the 2 imm and 3 amf daemons are included for 
the crashes.

The README file in the examples directory tells the tale (actually, five 
different configurations of what is probably one bug).

I tried a workaround, putting all the added services on the payloads only, 
but that didn't fix it.

It is a 7zip archive with ultra compression, as small as I could squeeze it 
(25MB). It is located at the box.com storage site: 
https://app.box.com/s/t9ghgglv6cs0kaf10bnm0ial7be8h9zo

On yum systems, 7zip is installed by "sudo yum -y install p7zip-plugins.x86_64 
p7zip.x86_64"

FYI, the command line I used for making the archive is "7z a -t7z -m0=lzma 
-mx=9 -mfb=64 -md=32m -ms=on examples.7z examples", in case that info is needed.
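
For anyone unpacking it, here is a minimal sketch of the reverse step. The 
function name unpack_examples and the destination directory are hypothetical; 
only the archive name examples.7z comes from the command line above. It checks 
the archive with "7z t" before extracting with "7z x".

```shell
# Hypothetical helper: verify the archive, then unpack it into a directory.
unpack_examples() {
    archive="$1"      # e.g. examples.7z, as downloaded
    dest="${2:-.}"    # destination directory (defaults to the current one)
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    7z t "$archive" >/dev/null || return 1   # integrity test before extracting
    7z x "$archive" -o"$dest"                # extract, keeping directory paths
}
```

Usage would be something like "unpack_examples examples.7z /tmp/amf-examples", 
after which the README should be in the examples directory under that path.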

Charlie ...

-----Original Message-----
From: praveen malviya [mailto:[email protected]] 
Sent: Thursday, March 12, 2015 10:23 PM
To: Johnson, Charles; [email protected]
Subject: Re: [users] amf-adm question ...

On 11-Mar-15 4:56 AM, Johnson, Charles wrote:
> So, that works and it scales out, and I encountered yet another issue ...
>
> I scaled this out on ten Ethernet-connected native nodes, running 8 to 10 
> service processes per node in 2N groups and got it working by changing the 
> xml for the payload-only nodes using your changes.
>
> Thank you very much, Praveen !!
>
> Now, I was only able to get this working by installing all the software on 
> every node (opensaf and my services), bringing up the vanilla imm.xml for 
> opensaf native services, and then running the one immcfg and four amf-adm 
> commands for each service group (42 service groups) in serial order, from a 
> generated bash script file, after opensaf was up and running.
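
A rough sketch of how such a per-group sequence might be generated follows. 
It is not the actual script: the function name emit_group_commands and the 
argument layout are hypothetical, and the choice of the four amf-adm commands 
(unlock-in and then unlock for each of the two SUs of a 2N group) is an 
assumption, since the message does not say which four were used. The function 
only prints the commands, it does not run them.

```shell
# Print the per-group command sequence; nothing is executed here.
# Arguments: <config.xml> <SU1 DN> <SU2 DN> (all hypothetical placeholders).
emit_group_commands() {
    xml="$1"; su1="$2"; su2="$3"
    printf 'immcfg -f %s\n' "$xml"           # load the group's IMM objects
    printf 'amf-adm unlock-in %s\n' "$su1"   # assumed: unlock-in both SUs...
    printf 'amf-adm unlock-in %s\n' "$su2"
    printf 'amf-adm unlock %s\n' "$su1"      # ...then unlock them
    printf 'amf-adm unlock %s\n' "$su2"
}
```

Calling it once per service group with that group's xml file and SU DNs (for 
example "safSu=SU1,safSg=MyGroup,safApp=MyApp", an illustrative name) and 
redirecting the output to a file would give a serial script of the kind 
described.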
>
> Separately, although I was able to execute the immxml-merge command 
> successfully (using the --ignore-variants flag to get rid of the errors from 
> common objects in each 2N xml file for each service) and thus produce a valid 
> and totally inclusive imm.xml for opensaf + my services (in the exact same 
> object order that the commands from the bash file would add them at runtime), 
> when I tried to bring opensaf up (sudo service opensafd start) on the two 
> controller nodes, or just one of them, it either fails to start or crashes 
> the controller node. I checked, and all the merged imm.xml files are 
> identical on all nodes, and all the software is installed identically for 
> both the unmerged and merged cases.
>
> My thought is that OpenSAF cannot orchestrate the startup of the cluster with 
> that many services (outside of its own well-orchestrated startup sequence for 
> the native opensaf service taxonomy) and gets in a traffic jam internally, 
> hangs or crashes, but does not start.
>
> There appears to have been a service called SCAP sometime in the past, where 
> you would modify a file called NCSSystemBOM.xml to add your service to get 
> started when OpenSAF first comes up, but that seems not to be the case 
> anymore. Did that framework for startup disappear, or get replaced by 
> something else?
>
> Or is there some magic thing I need to do when I do immxml-merge 
> --ignore-variants, to allow OpenSAF to come up with that merged imm.xml 
> without hanging? If you could not do that magic thing, there would be no use 
> for immxml-merge, and it would probably not exist for long, so there must be 
> a magic thing, I reckon!
>
I have not used immxml-merge --ignore-variants at any time.
The framework is in place. Any AMF-modeled application will come up during 
cluster start, after expiry of the cluster startup timer, if all AMF model 
objects are proper. Please share the error messages/syslog and also imm.xml 
if possible.

Thanks.
Praveen
> Charlie ...
>
> -----Original Message-----
> From: praveen malviya [mailto:[email protected]]
> Sent: Sunday, February 15, 2015 8:12 PM
> To: Johnson, Charles; [email protected]
> Subject: Re: [users] amf-adm question ...
>
>
>
> On 14-Feb-15 12:40 AM, Johnson, Charles wrote:
>>
>> Interesting. When I try to move the AmfDemo and run it from PL-4 and PL-3 
>> instead of SC-1 and SC-2, it fails to load.
>>
>> What I did was to change the AppConfig-2N.xml file, making those 
>> substitutions; the text is included below (it's not long).
>>
>> The log states that it cannot find the script which is in the same place it 
>> was on all the nodes (/opt/amf_demo/amf_demo_script), or that it is corrupt, 
>> which it is not.
>>
>> Works fine in the controller nodes, not in the payload nodes: am I missing 
>> some limitation regarding Amf?
> Please configure the attribute below in the SU object to host it on the 
> desired node. In the sample configuration this attribute is not configured.
> <attr>
>          <name>saAmfSUHostNodeOrNodeGroup</name>
>          <value>safAmfNode=SC-1,safAmfCluster=myAmfCluster</value>
> </attr>
>
> See below the configuration with changes.
>
> Thanks
> Praveen
>

_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
