[
https://issues.apache.org/jira/browse/HDDS-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16749956#comment-16749956
]
Yiqun Lin commented on HDDS-989:
--------------------------------
Looks good to me overall. Some comments:
* Can we rename {{numSyncChecks}} to {{numAllVolumesChecks}}? That seems more
readable and is consistent with the operation name.
* One comment on the following logic:
{code:java}
+
+ final AtomicLong numVolumes = new AtomicLong(volumes.size());
+ final CountDownLatch latch = new CountDownLatch(1); // --> why not CountDownLatch(volumes.size())?
+
+ for (HddsVolume v : volumes) {
+ Optional<ListenableFuture<VolumeCheckResult>> olf =
+ delegateChecker.schedule(v, null);
+ LOG.info("Scheduled health check for volume {}", v);
+ if (olf.isPresent()) {
+ allVolumes.add(v);
+ Futures.addCallback(olf.get(),
+ new ResultHandler(v, healthyVolumes, failedVolumes,
+ numVolumes, (ignored1, ignored2) -> latch.countDown()));
+ } else {
+ if (numVolumes.decrementAndGet() == 0) {
+ latch.countDown();
+ }
+ }
+ }
{code}
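To make the latch question concrete, here is a minimal, self-contained sketch of the two options. It is not the patch's code: {{LatchSketch}}, {{awaitWithCounter}} and {{awaitWithSizedLatch}} are hypothetical names, and plain executor tasks stand in for the volume checks and {{ResultHandler}}. It assumes every volume is accounted for exactly once, either by its callback or inline when no check was scheduled.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the HDDS-989 patch.
final class LatchSketch {

  // Variant A: single-count latch, opened by whoever drives the shared
  // counter to zero (the shape used in the quoted hunk).
  static boolean awaitWithCounter(int total, int skipped, long timeoutMillis) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      AtomicLong remaining = new AtomicLong(total);
      CountDownLatch latch = new CountDownLatch(1);
      for (int i = 0; i < total; i++) {
        if (i < skipped) {
          // No check was scheduled for this volume: account for it inline.
          if (remaining.decrementAndGet() == 0) {
            latch.countDown();
          }
        } else {
          // Stand-in for the asynchronous check completing.
          pool.execute(() -> {
            if (remaining.decrementAndGet() == 0) {
              latch.countDown();
            }
          });
        }
      }
      return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdownNow();
    }
  }

  // Variant B: latch sized to the number of volumes, counted down once
  // per volume; no separate counter is needed.
  static boolean awaitWithSizedLatch(int total, int skipped, long timeoutMillis) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      CountDownLatch latch = new CountDownLatch(total);
      for (int i = 0; i < total; i++) {
        if (i < skipped) {
          latch.countDown();
        } else {
          pool.execute(latch::countDown);
        }
      }
      return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("counter, 4 volumes: " + awaitWithCounter(4, 2, 2000));
    System.out.println("sized,   4 volumes: " + awaitWithSizedLatch(4, 2, 2000));
    System.out.println("counter, 0 volumes: " + awaitWithCounter(0, 0, 200));
    System.out.println("sized,   0 volumes: " + awaitWithSizedLatch(0, 0, 200));
  }
}
```

Under that assumption both variants release the waiter at the same point. One difference shows up in the sketch: with zero volumes, a latch of size 0 is already open, while the single-count variant is never released and the wait times out. Whether the patch handles an empty volume set elsewhere is not visible in the quoted hunk.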
* Can we log an error with the exception here instead of calling {{e.printStackTrace()}}?
{code:java}
+ this.periodicDiskChecker =
+ diskCheckerservice.scheduleWithFixedDelay(() -> {
+ try {
+ checkAllVolumes();
+ } catch (IOException e) {
+ e.printStackTrace();
+ }
+ }, DISK_CHECK_INTERVAL_MINUTES, DISK_CHECK_INTERVAL_MINUTES,
+ TimeUnit.MINUTES);
{code}
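A minimal sketch of what I have in mind for the catch block. It uses {{java.util.logging}} only to stay self-contained; with the SLF4J-style {{LOG}} that the patch appears to use, the equivalent would be a call such as {{LOG.error("...", e)}}. The names {{PeriodicCheckSketch}}, {{failingCheck}} and {{runFor}} are hypothetical, and {{failingCheck}} is just a stand-in for {{checkAllVolumes()}}.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch, not the HDDS-989 patch.
final class PeriodicCheckSketch {
  private static final Logger LOG =
      Logger.getLogger(PeriodicCheckSketch.class.getName());

  // Stand-in for checkAllVolumes(): always fails with an IOException.
  static void failingCheck() throws IOException {
    throw new IOException("simulated disk error");
  }

  // Schedules the failing check with a fixed delay, lets it run for
  // runMillis, and returns how many times the task body was entered.
  static int runFor(long delayMillis, long runMillis) {
    AtomicInteger runs = new AtomicInteger();
    ScheduledExecutorService service =
        Executors.newSingleThreadScheduledExecutor();
    try {
      service.scheduleWithFixedDelay(() -> {
        runs.incrementAndGet();
        try {
          failingCheck();
        } catch (IOException e) {
          // Log through the logger, with the exception attached, so the
          // stack trace lands in the regular log rather than on stderr.
          LOG.log(Level.SEVERE, "Unexpected exception while checking volumes", e);
        }
      }, 0, delayMillis, TimeUnit.MILLISECONDS);
      Thread.sleep(runMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      service.shutdownNow();
    }
    return runs.get();
  }

  public static void main(String[] args) {
    System.out.println("task executions: " + runFor(50, 400));
  }
}
```

Keeping the catch inside the task matters as well: per the {{ScheduledExecutorService}} contract, an exception that escapes one execution suppresses all subsequent executions, so the periodic check would silently stop.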
> Check Hdds Volumes for errors
> -----------------------------
>
> Key: HDDS-989
> URL: https://issues.apache.org/jira/browse/HDDS-989
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Reporter: Arpit Agarwal
> Assignee: Arpit Agarwal
> Priority: Major
> Attachments: HDDS-989.01.patch, HDDS-989.02.patch
>
>
> HDDS volumes should be checked for errors periodically.