[ https://issues.apache.org/jira/browse/TRAFODION-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379654#comment-16379654 ]
Gonzalo E Correa commented on TRAFODION-2883:
---------------------------------------------

I have created the pull request, which will need a code review before it can be merged.
[https://github.com/apache/trafodion/pull/1457]

These changes include the ability to run the monitor processes in AGENT mode from a Python installation, plus several other scale-related changes and bug fixes.

To enable AGENT mode, uncomment the following environment variables in sqenvcom.sh and copy the file to all nodes.
{panel:title=sqenvcom.sh }
# Monitor process creator:
#   MPIRUN - monitor process is created by mpirun
# Uncomment SQ_MON_CREATOR when running monitor in AGENT mode
#export SQ_MON_CREATOR=MPIRUN

# Monitor process run mode:
#   AGENT - monitor process runs in agent mode versus MPI collective
# Uncomment the three environment variables below
#export SQ_MON_RUN_MODE=AGENT
#export MONITOR_COMM_PORT=23399
#export MONITOR_SYNC_PORT=2339
{panel}

An alternative to the above is to add the following to sql/scripts/shell.env:
SQ_MON_CREATOR=MPIRUN
SQ_MON_RUN_MODE=AGENT
MONITOR_COMM_PORT=23399
MONITOR_SYNC_PORT=23398

To enable monitor trace when in AGENT mode, edit sql/scripts/monitor.env and uncomment the desired trace level.
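As a quick sanity check before starting the monitor, the four settings above can be verified in the current shell. The following is only a minimal sketch: the function name check_agent_env is hypothetical and not part of Trafodion, and the rule that the two ports must differ is an assumption made here (two listeners on one node cannot normally share a port), not something the pull request states.

```shell
# Hypothetical helper (not shipped with Trafodion): checks that the AGENT mode
# variables named in this comment are set consistently in the current shell.
check_agent_env() {
    cae_rc=0

    if [ "${SQ_MON_CREATOR:-}" != "MPIRUN" ]; then
        echo "SQ_MON_CREATOR should be MPIRUN, got '${SQ_MON_CREATOR:-}'" >&2
        cae_rc=1
    fi

    if [ "${SQ_MON_RUN_MODE:-}" != "AGENT" ]; then
        echo "SQ_MON_RUN_MODE should be AGENT, got '${SQ_MON_RUN_MODE:-}'" >&2
        cae_rc=1
    fi

    # Both ports must be non-empty and contain digits only.
    for cae_name in MONITOR_COMM_PORT MONITOR_SYNC_PORT; do
        eval "cae_value=\${$cae_name:-}"
        case "$cae_value" in
            ''|*[!0-9]*)
                echo "$cae_name is not a numeric port: '$cae_value'" >&2
                cae_rc=1
                ;;
        esac
    done

    # Assumption: the comm and sync listeners need distinct ports.
    if [ -n "${MONITOR_COMM_PORT:-}" ] &&
       [ "${MONITOR_COMM_PORT:-}" = "${MONITOR_SYNC_PORT:-}" ]; then
        echo "MONITOR_COMM_PORT and MONITOR_SYNC_PORT must differ" >&2
        cae_rc=1
    fi

    return "$cae_rc"
}
```

After sourcing sqenvcom.sh (or exporting the shell.env values by hand), running check_agent_env returns 0 when all four settings look consistent and prints a message to stderr for each one that does not.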
Once this is merged to the baseline, I will merge up these changes to the shared TRAFODION-2884 branch in the zcorrea_fork.

> Preliminary Trafodion Foundation Scalability Enhancements
> ---------------------------------------------------------
>
>                 Key: TRAFODION-2883
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2883
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: dtm, foundation, installer
>    Affects Versions: 2.3
>            Reporter: Gonzalo E Correa
>            Assignee: Gonzalo E Correa
>            Priority: Major
>             Fix For: 2.3
>
>
> Initial changes required to:
> - AGENT mode monitor
>     o Preliminary change to remove dependency on OpenMPI during
>       initialization of operational cluster by creating a cluster
>       of one node (MASTER monitor) where other remote nodes (SLAVE
>       monitors) join the cluster through the MASTER
> - MASTER monitor selection
> - Scale bug fixes found when creating clusters greater than 120 nodes

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)