[
https://issues.apache.org/jira/browse/SOLR-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shalin Shekhar Mangar updated SOLR-10515:
-----------------------------------------
Attachment: SOLR-10515.patch
Thanks Andrzej. There were a few conflicts in the tests because of my last
commit on SOLR-10738 so I fixed those in this patch. It should apply cleanly on
the autoscaling branch.
A few comments:
# You need to add SuppressForbidden annotation on IdUtils, NodeAddedTrigger and
NodeLostTrigger and their tests due to the use of System.currentTimeMillis
otherwise precommit will complain
# Why TreeSet and TreeMap instead of HashSet and HashMap in the triggers? I
tried to find where we rely on the order anywhere but couldn't.
# While replaying, ScheduledTrigger should check for hasPendingActions first
otherwise it will discard the event for no reason
# TriggerEvent.getEventTime() now returns System.currentTimeMillis but the
NodeAddedTriggerTest (and NodeLostTriggerTest) still compare with
System.nanoTime
# OverseerTriggerThread uses while (!isClosed) but this usage of isClosed is
without obtaining the update lock so the right value may not be visible. I'd
rather keep it as while(true)
# The removal of triggerState should probably be moved from
OverseerTriggerThread to ScheduledTriggers.remove after trigger is closed
otherwise NoNodeException will be needlessly logged if the removed trigger was
executing concurrently. Also I don't see the corresponding removal of the
events path anywhere.
# When a trigger is replaced, we call restoreState(old.trigger) and then when
the trigger is run for the first time, it restores (the same data) from ZK
again. Not a big deal, just something I noticed. We can solve it setting replay
to false if we want to.
# After we deque the event we assert that its id is same as the event we
expected. Can we log if they don't match in addition to the assert?
# The use isClosed in ScheduledTrigger is not safe because the close() method
will always be called by a different thread than the one executing the run()
method. So either make isClosed volatile or an AtomicBoolean.
> Persist intermediate trigger state in ZK to continue tracking information
> across overseer restarts
> --------------------------------------------------------------------------------------------------
>
> Key: SOLR-10515
> URL: https://issues.apache.org/jira/browse/SOLR-10515
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Reporter: Shalin Shekhar Mangar
> Assignee: Andrzej Bialecki
> Labels: autoscaling
> Fix For: master (7.0)
>
> Attachments: SOLR-10515.patch, SOLR-10515.patch
>
>
> The current trigger design is simplistic and keeps all the intermediate state
> in memory. But this presents two problems when the overseer itself fails:
> # We lose tracking state such as which node was added before the overseer
> restarted
> # A nodeLost trigger can never really fire for the overseer node itself
> So we need a way, preferably in the trigger API itself to save intermediate
> state or checkpoints so that it can seamlessly continue on overseer restarts.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]