[ https://issues.apache.org/jira/browse/MRESOLVER-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tamas Cservenak updated MRESOLVER-404:
--------------------------------------

Description:

Originally (for today's behavior, see below) the Hazelcast NamedLock implementation worked like this:
* on lock acquire, an ISemaphore DO named after the lock is created (or just looked up, if it already exists) and is refCounted
* on lock release, if the refCount drops to 0 (no more uses), the ISemaphore is destroyed (releasing HZ cluster resources)
* if after some time a new lock acquire happens for the same name, the ISemaphore DO gets re-created.

(A rough sketch of this original strategy follows below.)

Today, the HZ NamedLocks implementation works in the following way:
* there is only one semaphore provider implementation, the {{DirectHazelcastSemaphoreProvider}}, which maps the lock name 1:1 onto the ISemaphore Distributed Object (DO) name and does not destroy the DO (also sketched below).

The reason for this is historical: the named locks precursor code was originally written for Hazelcast 2/3, which used "unreliable" distributed operations, so recreating a previously destroyed DO was possible (at the cost of that "unreliability"). Hazelcast 4.x switched to the Raft consensus algorithm and made things reliable, but at the cost that a DO, once created and then destroyed, cannot be recreated anymore. This change was applied to {{DirectHazelcastSemaphoreProvider}} as well, by simply not dropping unused ISemaphores (the release semaphore method is a no-op).

But this has an important consequence: a long-running Hazelcast cluster will accumulate more and more ISemaphore DOs (basically as many as there are artifacts across all the builds that use this cluster for coordination). The number of artifacts out there is not infinite, but it is large enough -- especially if the cluster is shared across many different/unrelated builds -- to grow beyond any sane limit.

So the current recommendation is to have a "large enough" dedicated Hazelcast cluster and use {{semaphore-hazelcast-client}} (a "thin client" that connects to the cluster) instead of {{semaphore-hazelcast}} (a "thick client" that joins the cluster as a node, putting that burden onto the JVM process running it, hence onto Maven as well). But even then, a regular reboot of the cluster may be needed.

A proper but somewhat more complicated solution would be to introduce some sort of indirection: create only as many ISemaphores as are needed at the moment, and map them onto the lock names currently in use (reusing semaphores that have become unused); see the last sketch below. The problem is that this mapping would need to be distributed as well (so all clients pick it up, or perform the new mapping themselves), and that may cause a performance penalty. Only exhaustive performance testing could prove whether it does.
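For illustration, a minimal sketch of that original refCounting strategy, written against the current Hazelcast 4.x CP API for readability; the class and method names here are made up and are not the actual resolver provider API:

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.cp.ISemaphore;

// Illustrative only: one ISemaphore DO per lock name, refCounted locally,
// destroyed once the last user released it.
class RefCountingSemaphoreProviderSketch {

    private final ConcurrentMap<String, AtomicInteger> refCounts = new ConcurrentHashMap<>();

    ISemaphore acquireSemaphore(HazelcastInstance hz, String lockName) {
        // creates the DO on first use of this name, or just looks it up if it already exists
        ISemaphore semaphore = hz.getCPSubsystem().getSemaphore(lockName);
        semaphore.init(Integer.MAX_VALUE); // permit count illustrative; no-op if already initialized
        refCounts.computeIfAbsent(lockName, k -> new AtomicInteger()).incrementAndGet();
        return semaphore;
    }

    void releaseSemaphore(HazelcastInstance hz, String lockName, ISemaphore semaphore) {
        // once nobody uses this name anymore, destroy the DO to free cluster resources --
        // the step that is no longer safe since Hazelcast 4.x, because a destroyed CP DO
        // cannot be re-created under the same name
        if (refCounts.get(lockName).decrementAndGet() == 0) {
            refCounts.remove(lockName);
            semaphore.destroy();
        }
    }
}
{code}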
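And a sketch of today's direct 1:1 behavior, where release is intentionally a no-op, so every lock name leaves an ISemaphore DO behind (again, names are illustrative only, not the actual {{DirectHazelcastSemaphoreProvider}} code):

{code:java}
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.cp.ISemaphore;

// Illustrative only: the lock name is used directly as the ISemaphore DO name
// and the DO is never destroyed, so it stays in the cluster for good.
class DirectSemaphoreProviderSketch {

    ISemaphore acquireSemaphore(HazelcastInstance hz, String lockName) {
        ISemaphore semaphore = hz.getCPSubsystem().getSemaphore(lockName);
        semaphore.init(Integer.MAX_VALUE); // permit count illustrative; no-op if already initialized
        return semaphore;
    }

    void releaseSemaphore(HazelcastInstance hz, String lockName, ISemaphore semaphore) {
        // intentionally a no-op: destroying a CP ISemaphore would make this name unusable forever,
        // so the DO is simply left behind -- and accumulates in a long-running cluster
    }
}
{code}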
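Finally, a very rough sketch of the proposed indirection, assuming a distributed IMap for the name mapping and an IQueue as a free pool of reusable semaphore names. All DO names and pooling details here are made up, and the release side -- knowing when a lock name is truly unused -- is exactly the open, performance-sensitive part:

{code:java}
import com.hazelcast.collection.IQueue;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.cp.ISemaphore;
import com.hazelcast.map.IMap;

// Illustrative only: lock names are mapped (via a distributed IMap, so all clients see the
// same mapping) onto a pool of ISemaphore DOs that are created once and reused.
class PooledSemaphoreProviderSketch {

    // hypothetical DO names, nothing like this exists in resolver today
    private static final String NAME_MAPPING = "resolver-lock-name-mapping";
    private static final String FREE_SEMAPHORES = "resolver-free-semaphores";
    private static final String SEMAPHORE_COUNTER = "resolver-semaphore-counter";

    ISemaphore acquireSemaphore(HazelcastInstance hz, String lockName) {
        IMap<String, String> mapping = hz.getMap(NAME_MAPPING);
        String semaphoreName = mapping.get(lockName);
        if (semaphoreName == null) {
            // reuse a previously freed semaphore if one exists, otherwise allocate a new slot
            IQueue<String> free = hz.getQueue(FREE_SEMAPHORES);
            String candidate = free.poll();
            if (candidate == null) {
                long slot = hz.getCPSubsystem().getAtomicLong(SEMAPHORE_COUNTER).incrementAndGet();
                candidate = "resolver-semaphore-" + slot;
            }
            String raced = mapping.putIfAbsent(lockName, candidate);
            if (raced != null) {
                // another client mapped this name first; return our candidate to the pool
                free.offer(candidate);
                semaphoreName = raced;
            } else {
                semaphoreName = candidate;
            }
        }
        ISemaphore semaphore = hz.getCPSubsystem().getSemaphore(semaphoreName);
        semaphore.init(Integer.MAX_VALUE); // permit count illustrative; no-op if already initialized
        return semaphore;
    }

    void releaseSemaphore(HazelcastInstance hz, String lockName, ISemaphore semaphore) {
        // the hard part left open here: deciding when a lock name is really unused needs a
        // distributed use count; only then may the mapping entry be removed and the semaphore
        // name offered back to FREE_SEMAPHORES for reuse by another name
    }
}
{code}

Whether the extra IMap/IQueue round trips on every lock acquire (and the distributed bookkeeping on release) are acceptable is what the exhaustive perf testing mentioned above would have to show.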
> New strategy for Hazelcast named locks
> --------------------------------------
>
> Key: MRESOLVER-404
> URL: https://issues.apache.org/jira/browse/MRESOLVER-404
> Project: Maven Resolver
> Issue Type: Improvement
> Components: Resolver
> Reporter: Tamas Cservenak
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.20.10#820010)