[ClusterLabs] network fencing - azure arm
Hello,

I have a question about stonith configuration within Azure and I hope I'm using the correct mailing list. I've installed two virtual machines with pacemaker 1.1.18, pcs 0.9.162 and fence-agents-azure-arm 4.0.11.86. Now I'm unable to create a stonith configuration via pcs when using network fencing; without network fencing everything works fine.

The following command works as expected (without network fencing):

# pcs stonith create stonith.node1 fence_azure_arm login=$az_login passwd=$az_passwd resourceGroup=$az_rg subscriptionId=$az_sid tenantId=$az_tenant retry_on=0 pcmk_host_list=node1

If I add "network-fencing" to the list, pcs throws an error:

Error: missing value of 'network-fencing' option

I don't know what's wrong, because network-fencing doesn't require any value:

# fence_azure_arm -h | grep -A2 network-fencing
--network-fencing    Use network fencing. See NOTE-section of metadata for required Subnet/Network Security Group configuration.

network-fencing="on", "true" or "0" didn't work either. If I run fence_azure_arm with the network-fencing option manually, everything works as expected. Unfortunately this fence agent is only sparsely documented and I couldn't find any example for network-fencing.

Thanks for your help!

Best Regards,
Thomas B.
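For illustration, a manual invocation of the kind described above might look roughly like this; the long-option names are taken from the agent's own help output and may differ between fence-agents builds, so treat it as a sketch rather than a verified command:

# illustrative only -- option names and values may vary by fence-agents version
fence_azure_arm --network-fencing \
    --username=$az_login --password=$az_passwd \
    --resourceGroup=$az_rg --subscriptionId=$az_sid --tenantId=$az_tenant \
    --plug=node1 --action=off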
Re: [ClusterLabs] Antw: Announcing Anvil! m2 v2.0.7
On 2018-11-20 2:53 a.m., Ulrich Windl wrote:
> Hi!
>
> You forgot the most important piece of information: "What is it?" I guess it's
> so obvious for you that you forgot to mention it. ;-)
>
> Regards,
> Ulrich

Heh, fair enough :)

We have a specialized, "canned" HA cluster that adds a lot of autonomous operation, particularly useful for deployments where there aren't cluster specialists available (factories, cargo ships, etc). We wrote a rough overview of it here, if you're curious:

https://www.alteeve.com/w/What_is_an_Anvil!_and_why_do_I_care%3F

Basically, if you want to host VMs but won't be there to take care of them often, you might find the Anvil! platform quite appealing.

cheers!

> Digimer wrote on 20.11.2018 at 08:25 in message
> <3ff31468-4052-dda7-7841-4c04985ad...@alteeve.ca>:
>> * https://github.com/ClusterLabs/striker/releases/tag/v2.0.7
>>
>> This is the first release since March, 2018. No critical issues are known
>> or were fixed. Users are advised to upgrade.
>>
>> Main bugs fixed;
>>
>> * Fixed install issues for Windows 10 and 2016 clients.
>> * Improved duplicate record detection and cleanup in scan-clustat and scan-storcli.
>> * Disabled the detection and recovery of 'paused' state servers (it caused more trouble than it solved).
>>
>> Notable new features;
>> * Improved the server boot logic to choose the node with the most running servers, all else being equal.
>> * Updated UPS power transfer reason alerts from "warning" to "notice" level alerts.
>> * Added support for EL 6.10.
>>
>> Users can upgrade using 'striker-update' from their Striker dashboards.
>>
>> /sbin/striker/striker-update --local
>> /sbin/striker/striker-update --anvil all
>>
>> Please feel free to report any issues in the Striker github repository.

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein's brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
[ClusterLabs] FYI: dlm.service possibly silenty overwritten with DisplayLink driver installer
Accidentally, while searching for something systemd-related, dlm.service caught my eye, and surprisingly it was in a hardware-support / Linux software-enablement context. Briefly looking into the Ubuntu driver that allegedly contained that file (or, rather, the recipe to create it), I realized the respective installer would simply overwrite the regular cluster DLM service file without any hesitation (unless I am missing something).

So take this as a heads-up to watch out for that circumstance; three-letter acronyms are apparently not very namespace-collision-proof.

I logged this with their "Feature Suggestions" forum, asking for more carefulness:
https://support.displaylink.com/forums/287786-displaylink-feature-suggestions/suggestions/36068896

-- 
Jan (Poki)
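If you want to check which dlm.service is actually in effect on a node, something along these lines can help; the Debian/Ubuntu package name below is a guess and may differ on your distribution, so take this as an illustrative sketch:

# illustrative checks only
systemctl cat dlm.service      # shows which unit file is in effect and its contents
rpm -V dlm                     # RPM-based systems: verify the packaged unit file is unmodified
dpkg --verify dlm-controld     # Debian/Ubuntu: package name assumed, adjust as needed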
Re: [ClusterLabs] Pacemaker failed to restart subprocess of host if container also uses pacemaker cluster!
On Fri, 2018-11-16 at 16:33 +0800, ma.jinf...@zte.com.cn wrote:
> There is a problem in my setup with pacemaker: pacemaker fails to restart
> its subprocesses on the host if a container also uses a pacemaker cluster!

That might not be supportable with the current code. It's possible to have a nested cluster with VMs, but containers probably share too much of the host environment. There was an issue not that long ago with libqb that led to a new libqb option to use filesystem sockets instead of Linux native sockets; that might help, but I wouldn't be surprised if there are more issues.

One problem with nested clusters is fencing; it's difficult to get fencing working reliably in both clusters. If the reason for the separation is policy, then VMs may be the only way. Otherwise, if you just want to control resources inside the containers, then the new bundles feature or the Pacemaker Remote feature would be the best way to handle it.

> The environment is as follows:
> 1. corosync version 2.4.0, pacemaker version 1.1.16
> 2. a three-node cluster, and the container also has a pacemaker cluster
> This issue causes the cluster to stop working normally when the node is
> restarted or the pacemakerd process is restarted.
> I did a test for it: stop corosync (leading to a pacemaker restart); the
> logs are as follows:
> ///stop corosync//
> [ubuntu@paas-controller-208-1-0-40:~]$ sudo su
> [root@paas-controller-208-1-0-40:/home/ubuntu]$ service corosync stop
> [root@paas-controller-208-1-0-40:/home/ubuntu]$ ps -elf | grep pacemaker
> 4 S root 16613 14434 0 80 0 - 26569 poll_s 19:09 pts/2 00:00:00 /usr/sbin/pacemakerd
> 4 S haclust+ 16619 16613 0 80 0 - 27481 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/cib
> 4 S root 16620 16613 0 80 0 - 27454 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/stonithd
> 4 S root 16622 16613 0 80 0 - 19155 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/lrmd
> 4 S haclust+ 16623 16613 0 80 0 - 25141 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/attrd
> 4 S haclust+ 16624 16613 0 80 0 - 20618 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/pengine
> 4 S haclust+ 16625 16613 0 80 0 - 29743 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/crmd
> 4 S root 16628 14465 0 80 0 - 26569 poll_s 19:09 pts/3 00:00:00 /usr/sbin/pacemakerd
> 4 S haclust+ 16631 16628 0 80 0 - 27357 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/cib
> 4 S root 16632 16628 0 80 0 - 27455 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/stonithd
> 4 S root 16633 16628 0 80 0 - 19155 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/lrmd
> 4 S haclust+ 16634 16628 0 80 0 - 25142 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/attrd
> 4 S haclust+ 16635 16628 0 80 0 - 20618 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/pengine
> 4 S haclust+ 16636 16628 0 80 0 - 29743 poll_s 19:09 ? 00:00:00 /usr/libexec/pacemaker/crmd
> 4 S root 23559 1 0 80 0 - 20416 hrtime 19:10 ? 00:00:00 /usr/sbin/pacemakerd -f
> 4 S root 25105 11245 0 80 0 - 28203 pipe_w 19:10 pts/5 00:00:00 grep --color=auto pacemaker
> 4 S root 31529 1 0 80 0 - 19012 poll_s 14:41 ? 00:00:40 /usr/libexec/pacemaker/lrmd
> 4 S haclust+ 31531 1 0 80 0 - 24467 poll_s 14:41 ? 00:00:29 /usr/libexec/pacemaker/pengine
>
> Some pacemaker processes (crmd, attrd, cib, stonithd) seem to be lost, even
> if I restart pacemaker (service pacemaker start).
> Does anyone know how to deal with it? Thank you very much!
>
> Ma Jinfeng
> Communication Protocol Software Development Engineer
> Virtualization Dept. II / Wireless Research Institute / Wireless Product Operation Division
> NIV Dept. II / Wireless Product R&D Institute / Wireless Product Operation Division
> ZTE Corporation
> ZTE D2070, No. 889 Bibo Road, Pudong New Area, Shanghai
> T: +86 021 M: +86 17601320963
> E: ma.jinf...@zte.com.cn
> www.zte.com.cn

-- 
Ken Gaillot
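For anyone curious what the bundle route Ken mentions looks like in practice, here is a minimal, purely illustrative pcs sketch; the bundle name, image and resource agent are placeholders, and bundles require a new enough pacemaker/pcs (check the "pcs resource bundle" help for your version):

# illustrative only -- names and image are placeholders
pcs resource bundle create httpd-bundle \
    container docker image=registry.example.com/httpd:latest replicas=2 \
    network control-port=3121
pcs resource create httpd-inside ocf:heartbeat:apache bundle httpd-bundle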
Re: [ClusterLabs] pcs constraint order set syntax
On Mon, 2018-11-19 at 14:32 -0800, Chris Miller wrote:
> Hello,
> I am attempting to add a resource to an existing ordering constraint set.
> The system in question came pre-configured with PCS (FreePBX HA), and I need
> to add a resource group (queuemetrics) to the ordering constraint set.
> Before modification, the existing set is as follows (output from pcs config --full):
>
> set mysql httpd asterisk sequential=true (id:mysql-httpd-asterisk)
> setoptions kind=Optional (id:freepbx-start-order)
>
> I'm having issues with the "constraint order set" command syntax, specifically
> with setting options and IDs. Per the man page and help info, the syntax
> appears as if it should be this:
>
> pcs constraint order set mysql httpd asterisk queuemetrics sequential=true
> id=mysql-httpd-asterisk setoptions kind=Optional id=freepbx-start-order
>
> However, when running this command I receive the following error:
>
> Call cib_replace failed (-203): Update does not conform to the configured schema
>
> I have also tried variations of this syntax, and the ID option specifically is
> ignored and a dynamically generated name is used instead.
>
> I'm not having any luck finding guidance on this specific issue online.
> Thanks in advance for your guidance.
>
> Chris

I'm guessing the issue is that the set already exists; "order set" creates a new one. I don't think there is a single command to modify an existing set. You could delete the existing one and then add the new one (preferably using -f with an external file to make the changes atomic). Or, you could use "pcs cluster edit" and modify the XML interactively.

-- 
Ken Gaillot
[ClusterLabs] Corosync 3.0 - Alpha5 is available at corosync.org!
I am pleased to announce the fifth testing release (Alpha 5) of Corosync 3.0 (codename Camelback), available immediately from our website at http://build.clusterlabs.org/corosync/releases/ as corosync-2.99.4. You can also download RPMs for various distributions from CI: https://kronosnet.org/builds/.

This release turned out to be quite a bit bigger than I expected, which is why it's not Beta/RC yet. It also contains quite a lot of removal of unused/unmaintained code.

List of the biggest changes and backwards-compatibility breakages, with reasoning and replacements (if needed):

- Removal of CTS - unmaintained for a long time and unused by developers; currently without replacement.

- libtotem is no longer a shared library and is compiled directly into the corosync binary - the main idea behind libtotem.so was to allow other projects to use libtotem and build a custom "corosync" on top of it. This idea never got the expected usage, and because the library was so big it mostly made corosync development harder. This doesn't affect any corosync user. In the future totemsrp.c should be made into a real, well-testable library without network protocol handling, but right now there is no replacement for libtotem.so.

- Removal of all environment variables and the corosync arguments tied to them:
  - The -p, -P, -R and -r options - replaced by the system.sched_rr, system.priority and system.move_to_root_cgroup options in the config file.
  - env COROSYNC_MAIN_CONFIG_FILE - replaced by the "-c" option. This also affects the uidgid.d location.
  - env COROSYNC_TOTEM_AUTHKEY_FILE - replaced by the (already existing) totem.keyfile config option, which is now documented.
  - env COROSYNC_RUN_DIR - replaced by system.run_dir and documented.

- Removal of the libcgroup usage - deprecated in most new distributions, replaced by a short piece of code with equivalent functionality.

- NSS dependency removal - not needed anymore because crypto is now handled by knet, so no replacement is needed. This change affects only cpgverify, where the packet format has changed.

- The corosync config file parser has been updated so it is now stricter and (finally) displays the line with the error. Affects only broken config files.

- With a new enough LibQB, it's possible to send a command to corosync asking it to reopen its log files. This is used by logrotate in favor of the old copytruncate method. Copytruncate still exists and is compiled/installed by default with an old LibQB.

- Timestamps are now enabled by default. With a new enough LibQB, hires timestamps (including milliseconds) are used by default.
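To make the new config-file option names above concrete, here is a purely illustrative corosync.conf fragment; the values are examples only, not recommendations, and the exact accepted values should be checked against the corosync 3.x man pages:

# illustrative fragment only -- values are examples, check corosync.conf(5)
system {
    sched_rr: yes
    priority: max
    move_to_root_cgroup: yes
    run_dir: /var/run/corosync
}

totem {
    keyfile: /etc/corosync/authkey
}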
Complete changelog for Alpha 5 (compared to Alpha 4):

Chris Walker (3):
      Add option for quiet operation to corosync-cmapctl
      Add token_warning configuration option
      Add option to force cluster into GATHER state

Christine Caulfield (2):
      config: Fix crash in reload if new interfaces are added
      config: Allow generated nodeis for UDP & UDPU

Ferenc Wágner (3):
      man: fix cmap key name runtime.config.totem.token
      man: Fix typo connnections -> connections
      man: Fix typo conains -> contains

Jan Friesse (40):
      spec: Add explicit gcc build requirement
      totemknet: Free instance on failure exit
      util: Fix strncpy in setcs_name_t function
      cmap: Fix strncpy warning in cmap_iter_next
      ipc_glue: Fix strncpy in pid_to_name function
      totemconfig: Enlarge error_string_response
      totemsrp: Add assert into memb_lowest_in_config
      corosync-notifyd: Rename global local_nodeid
      Remove libcgroup
      build: Support for git archive stored tags
      git-version-gen: Fail on UNKNOWN version
      notifyd: Propagate error to exit code
      coroparse: Return error if config line is too long
      coroparse: Check icmap_set results
      coroparse: Fix remove_whitespace end condition
      coroparse: Be more strict in what is parsed
      coroparse: Add file name and line to error message
      coroparse: Use key_name for error message
      coroparse: Fix newly introduced warning
      man: Fix crypto_hash and crypto_cipher defaults
      cts: Remove CTS
      build: Remove NSS dependencies
      build: Do not compile totempg as a shared library
      build: Remove totempg shared library leftovers
      man: Fix default knet_pmtud_interval to match code
      totemconfig: Replace strcpy by strncpy
      log: Implement support for reopening log files
      config example: Migrate to newer syntax
      totemconfig: Fix logging of freed string
      logsys: Support hires timestamp
      logsys: Make hires timestamp default
      configure: move to AC_COMPILE_IFELSE
      main: Move sched paramaters to config file
      main: Replace COROSYNC_MAIN_CONFIG_FILE
      main: Remove COROSYNC_TOTEM_AUTHKEY_FILE
      man: Describe nodelist.node.name properly
      main: Remove COROSYNC_RUN_DIR
      init: Fix init script to work with containers
      stats: Fix delete of track
      notifyd: Delete registered tracking keys

Jan
Re: [ClusterLabs] Antw: VirtualDomain & parallel shutdown
Hi Ulrich,
The stop timeout needs to be quite big for an obvious reason called "the system is updating, do not turn it off". The main question here is why this slow shutdown prevents the other VMs from being shut down at all.

Regards,

On 20.11.2018 11:54:58 Ulrich Windl wrote:
> >>> Klechomir wrote on 20.11.2018 at 11:40 in message <12860117.ByXx81i3mo@bobo>:
> > Hi list,
> > Bumped onto the following issue lately:
> >
> > When multiple VMs are told to shut down one after another and the shutdown
> > of the first VM takes long, the others aren't being shut down at all until
> > the first one stops.
>
> I don't quite understand: When the stop timeout for a VM expires, the
> cluster takes measures, or in the Xen PV case the VM is terminated the hard way.
>
> > "batch-limit" doesn't seem to affect this.
> > Any suggestions why this could happen?
>
> I know of a market-leading software that needs more than five minutes to shut
> down even when it's doing nothing before and during the shutdown (no I/O, no CPU usage)... ;-)
> Meaning: software bugs?
>
> Regards,
> Ulrich
>
> > Best regards,
> > Klecho
[ClusterLabs] Antw: VirtualDomain & parallel shutdown
>>> Klechomir wrote on 20.11.2018 at 11:40 in message <12860117.ByXx81i3mo@bobo>:
> Hi list,
> Bumped onto the following issue lately:
>
> When multiple VMs are told to shut down one after another and the shutdown of
> the first VM takes long, the others aren't being shut down at all until the
> first one stops.

I don't quite understand: When the stop timeout for a VM expires, the cluster takes measures, or in the Xen PV case the VM is terminated the hard way.

> "batch-limit" doesn't seem to affect this.
> Any suggestions why this could happen?

I know of a market-leading software that needs more than five minutes to shut down even when it's doing nothing before and during the shutdown (no I/O, no CPU usage)... ;-)
Meaning: software bugs?

Regards,
Ulrich
[ClusterLabs] VirtualDomain & parallel shutdown
Hi list,
Bumped onto the following issue lately:

When multiple VMs are told to shut down one after another and the shutdown of the first VM takes long, the others aren't being shut down at all until the first one stops.

"batch-limit" doesn't seem to affect this.
Any suggestions why this could happen?

Best regards,
Klecho
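For reference, these are the two knobs touched on in this thread: the per-resource stop timeout and the cluster-wide batch-limit. A purely illustrative sketch (the resource name is a placeholder, and nothing here claims to resolve the serialization described above):

# illustrative only -- resource name is a placeholder
pcs resource update my_vm op stop timeout=600s     # allow a long, clean guest shutdown
pcs property set batch-limit=10                    # max actions the cluster runs in parallel
pcs property show batch-limit                      # check the current value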
Re: [ClusterLabs] Announcing Anvil! m2 v2.0.7
On Tue, 2018-11-20 at 02:25 -0500, Digimer wrote:
> * https://github.com/ClusterLabs/striker/releases/tag/v2.0.7
>
> This is the first release since March, 2018. No critical issues are
> known or were fixed. Users are advised to upgrade.

Congratulations!

Cheers,
Kristoffer

> Main bugs fixed;
>
> * Fixed install issues for Windows 10 and 2016 clients.
> * Improved duplicate record detection and cleanup in scan-clustat and scan-storcli.
> * Disabled the detection and recovery of 'paused' state servers (it caused more trouble than it solved).
>
> Notable new features;
> * Improved the server boot logic to choose the node with the most running servers, all else being equal.
> * Updated UPS power transfer reason alerts from "warning" to "notice" level alerts.
> * Added support for EL 6.10.
>
> Users can upgrade using 'striker-update' from their Striker dashboards.
>
> /sbin/striker/striker-update --local
> /sbin/striker/striker-update --anvil all
>
> Please feel free to report any issues in the Striker github repository.
Re: [ClusterLabs] pcs constraint order set syntax
Hello Chris,

On 19. 11. 2018 at 23:32, Chris Miller wrote:
> Hello,
> I am attempting to add a resource to an existing ordering constraint set.
> The system in question came pre-configured with PCS (FreePBX HA), and I need
> to add a resource group (queuemetrics) to the ordering constraint set.
> Before modification, the existing set is as follows (output from pcs config --full):
>
> set mysql httpd asterisk sequential=true (id:mysql-httpd-asterisk)
> setoptions kind=Optional (id:freepbx-start-order)
>
> I'm having issues with the "constraint order set" command syntax, specifically
> with setting options and IDs. Per the man page and help info, the syntax
> appears as if it should be this:
>
> pcs constraint order set mysql httpd asterisk queuemetrics sequential=true
> id=mysql-httpd-asterisk setoptions kind=Optional id=freepbx-start-order

According to the pcs man page:
* sequential=true id=mysql-httpd-asterisk are options
* kind=Optional id=freepbx-start-order are constraint_options

Allowed options are: action, require-all, role, sequential. So id=mysql-httpd-asterisk is not valid. For constraint_options it is possible to use id.

However, "pcs constraint order set" only creates a constraint set. Unfortunately it is not possible to update a constraint. As a workaround you can delete the constraint and create the new one in one step. Something like this:

$ pcs cluster cib temp-cib.xml
$ pcs constraint delete freepbx-start-order -f temp-cib.xml
$ pcs constraint order set mysql httpd asterisk queuemetrics sequential=true setoptions kind=Optional id=freepbx-start-order -f temp-cib.xml
$ pcs cluster cib-push temp-cib.xml

> However, when running this command I receive the following error:
>
> Call cib_replace failed (-203): Update does not conform to the configured schema
>
> I have also tried variations of this syntax, and the ID option specifically is
> ignored and a dynamically generated name is used instead.
>
> I'm not having any luck finding guidance on this specific issue online.
> Thanks in advance for your guidance.
>
> Chris

Ivan
Re: [ClusterLabs] Antw: Re: Antw: Placing resource based on least load on a node
On 2018-11-20 09:08, Michael Schwartzkopff wrote:
> On 20.11.18 at 08:57, Ulrich Windl wrote:
>> Michael Schwartzkopff wrote on 20.11.2018 at 08:41:
>>> On 20.11.18 at 08:35, Bernd wrote:
>>>> On 2018-11-20 08:06, Ulrich Windl wrote:
>>>>> Bernd wrote on 20.11.2018 at 07:21:
>>>>>> Hi,
>>>>>> I'd like to run a certain bunch of cronjobs from time to time on the cluster node (four-node cluster) that has the lowest load of all four nodes.
>>>>>> The parameters wanted for this system yet to be built are:
>>>>>> * automatic placement on one of the four nodes (i.e., the one with the lowest load)
>>>>>> * in case a node fails, it is automatically removed from the cluster
>>>>>> * only a single instance of the cronjob entity may be running
>>>>>> so this really screams for pacemaker being used as the foundation.
>>>>>> However, I'm not sure how to implement the "put onto node with least load" part. I was thinking of using node attributes for that, but I didn't find any solution "out of the box" for this. Furthermore, as load is a highly volatile value, how can one make sure that all cronjobs run to the end without being moved to a node that possibly got a lower load in the meantime than the one executing the jobs?
>>>>> Hi!
>>>>> Actually I think the last one is the easiest (assuming the cron jobs do not need any resources that are moved): once a cron job is started, it will run until it ends, whether its crontab has been moved or not. Despite that, I think cluster software is not ideal when you actually need load-balancing software.
>>>>> Regards,
>>>>> Ulrich
>>>> The only resource(s) existing would be the cron "runner".
>>>> The point about load balancing is true, yes... so, any idea what to use instead? Is there already a tool or framework for solving a problem like this available, or do I have to start from scratch? Not that I'd be too lazy, but what's the use of reinventing the wheel repeatedly...? ;)
>>>> Regards,
>>>> Bernd
>>> Hi,
>>> I solved this problem years ago. I used the utilization attribute, but you can use any attribute. You have to write an agent that measures the CPU load every X minutes and updates the attribute. Now you just have to add a location constraint that starts the resource on the node with the "best" attribute value. The "best" could be lowest CPU usage, most free RAM, or whatever you want.
>>> The disadvantage of this solution is that the cluster (i.e. pacemaker) has to recalculate the scores every time you update your attribute. That causes additional load. If you have many resources that interdepend, that additional load may not be negligible.
>> Hi!
>> Question on this: Is the cluster clever enough to check only updates of attributes that some rule actually uses, or does it re-evaluate everything when any attribute changes?
> Every time. That is what causes the load.

My thought was to update a variable stored per node which contains the value of the system load average over the last 15 minutes, which is extremely easy to gather. Based on that, crmd could do its job. Every ten minutes would be more than sufficient, as it's not a real cluster that's needed here. (Well, this seems to be an extremely rare use case, though.)

Bernd
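A minimal sketch of the attribute-based approach Michael describes, with illustrative names (the attribute name, resource name and threshold are placeholders, not a tested configuration):

# run periodically (cron, or an agent's monitor action) on every node:
# publish the 1-minute load average, scaled by 100, as a transient node attribute
attrd_updater -n cpu_load -U "$(awk '{ print int($1 * 100) }' /proc/loadavg)"

# keep the cron-runner resource off nodes whose published load is too high
pcs constraint location cron-runner rule score=-INFINITY cpu_load gt integer 200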
Re: [ClusterLabs] Antw: Re: Antw: Placing resource based on least load on a node
On 20.11.18 at 08:57, Ulrich Windl wrote:
> Michael Schwartzkopff wrote on 20.11.2018 at 08:41:
>> On 20.11.18 at 08:35, Bernd wrote:
>>> On 2018-11-20 08:06, Ulrich Windl wrote:
>>>> Bernd wrote on 20.11.2018 at 07:21:
>>>>> Hi,
>>>>> I'd like to run a certain bunch of cronjobs from time to time on the cluster node (four-node cluster) that has the lowest load of all four nodes.
>>>>> The parameters wanted for this system yet to be built are:
>>>>> * automatic placement on one of the four nodes (i.e., the one with the lowest load)
>>>>> * in case a node fails, it is automatically removed from the cluster
>>>>> * only a single instance of the cronjob entity may be running
>>>>> so this really screams for pacemaker being used as the foundation.
>>>>> However, I'm not sure how to implement the "put onto node with least load" part. I was thinking of using node attributes for that, but I didn't find any solution "out of the box" for this. Furthermore, as load is a highly volatile value, how can one make sure that all cronjobs run to the end without being moved to a node that possibly got a lower load in the meantime than the one executing the jobs?
>>>> Hi!
>>>> Actually I think the last one is the easiest (assuming the cron jobs do not need any resources that are moved): once a cron job is started, it will run until it ends, whether its crontab has been moved or not. Despite that, I think cluster software is not ideal when you actually need load-balancing software.
>>>> Regards,
>>>> Ulrich
>>> The only resource(s) existing would be the cron "runner".
>>> The point about load balancing is true, yes... so, any idea what to use instead? Is there already a tool or framework for solving a problem like this available, or do I have to start from scratch? Not that I'd be too lazy, but what's the use of reinventing the wheel repeatedly...? ;)
>>> Regards,
>>> Bernd
>> Hi,
>> I solved this problem years ago. I used the utilization attribute, but you can use any attribute. You have to write an agent that measures the CPU load every X minutes and updates the attribute. Now you just have to add a location constraint that starts the resource on the node with the "best" attribute value. The "best" could be lowest CPU usage, most free RAM, or whatever you want.
>> The disadvantage of this solution is that the cluster (i.e. pacemaker) has to recalculate the scores every time you update your attribute. That causes additional load. If you have many resources that interdepend, that additional load may not be negligible.
> Hi!
> Question on this: Is the cluster clever enough to check only updates of attributes that some rule actually uses, or does it re-evaluate everything when any attribute changes?

Every time. That is what causes the load.