[
https://issues.apache.org/jira/browse/GEODE-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jakov Varenina resolved GEODE-7989.
-----------------------------------
Fix Version/s: 1.13.0
Resolution: Fixed
> Improve logging of exceptions that happen during execution of backup
> --------------------------------------------------------------------
>
> Key: GEODE-7989
> URL: https://issues.apache.org/jira/browse/GEODE-7989
> Project: Geode
> Issue Type: Improvement
> Reporter: Jakov Varenina
> Assignee: Jakov Varenina
> Priority: Major
> Fix For: 1.13.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> When a backup running on the servers fails with an exception, e.g.
> "IOException: Not enough space left on device", that exception (feedback)
> is not propagated to the user of the DistributedSystemMXBean.backupAllMembers
> API. The user only gets the list of members and disk-stores for which the
> backup succeeded, with no indication of what caused the backup to fail for
> the other members, because the exception is not logged on the server when
> the log level is less verbose than debug (config, warn, ...). It would be
> good to have at least better logging for the following cases:
> 1. The disk where oplogs are saved is too small for the new oplog created by
> the Geode backup procedure. This step is executed in the Geode backup phase
> startDiskStoreBackup. If there is not enough space left on the device, Geode
> logs that exception only at DEBUG level (see below). It would be good to have
> this logged at info or warning log level.
> 2. There is not enough space on the disk to which oplogs are copied for backup
> (this doesn't need to be the same disk as mentioned before, and it is not the
> same disk in our case). This step in Geode is called completeBackup, and it
> logs nothing, not even at debug level, if a problem appears, but the disk
> stores are reported as offline (DiskBackupStatus.getOfflineDiskStores()). It
> would be good to have this exception logged at info or warning log level.
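
To illustrate what both cases ask for, here is a minimal sketch of catching an IOException from a backup step and logging it at warning level, so it shows up at default log levels. It is not Geode's code: the class BackupLoggingSketch, the BackupStep interface, the runStep helper and the logger name are hypothetical, and java.util.logging is used only to keep the example self-contained.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical stand-in for a backup step runner; this is not Geode's BackupTask. */
final class BackupLoggingSketch {
  // Assumption: a plain java.util.logging logger, chosen only for self-containment.
  static final Logger LOGGER = Logger.getLogger("backup.sketch");

  /** A backup step that may fail with an IOException, e.g. when the disk is full. */
  interface BackupStep {
    void run() throws IOException;
  }

  /**
   * Runs one step for the named disk store. On failure the exception is logged at
   * WARNING, which is visible at default log levels, and false is returned.
   */
  static boolean runStep(String diskStoreName, BackupStep step) {
    try {
      step.run();
      return true;
    } catch (IOException e) {
      LOGGER.log(Level.WARNING, "Backup failed for disk store " + diskStoreName, e);
      return false;
    }
  }

  /** Collects log records in memory so the logged level and cause can be inspected. */
  static final class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      records.add(record);
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }
}
```

The helper returns a boolean rather than rethrowing, so the caller can still report the disk store as failed while the cause is recorded in the log with its stack trace.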
> Exception logged only at debug level:
> java.io.IOException: Not enough space left on device
> at org.apache.geode.internal.shared.NativeCallsJNAImpl$POSIXNativeCalls.preBlow(NativeCallsJNAImpl.java:296)
> at org.apache.geode.internal.cache.Oplog.preblow(Oplog.java:1007)
> at org.apache.geode.internal.cache.Oplog.createCrf(Oplog.java:1073)
> at org.apache.geode.internal.cache.Oplog.<init>(Oplog.java:646)
> at org.apache.geode.internal.cache.Oplog.switchOpLog(Oplog.java:3723)
> at org.apache.geode.internal.cache.Oplog.forceRolling(Oplog.java:3643)
> at org.apache.geode.internal.cache.PersistentOplogSet.forceRoll(PersistentOplogSet.java:199)
> at org.apache.geode.internal.cache.backup.BackupTask.startDiskStoreBackup(BackupTask.java:274)
> at org.apache.geode.internal.cache.backup.BackupTask.startDiskStoreBackups(BackupTask.java:149)
> at org.apache.geode.internal.cache.backup.BackupTask.doBackup(BackupTask.java:111)
> at org.apache.geode.internal.cache.backup.BackupTask.backup(BackupTask.java:82)
> at org.apache.geode.internal.cache.backup.BackupService.lambda$prepareBackup$0(BackupService.java:62)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
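
On the caller side, the description above says the only hint of a failed backup is the set of offline disk stores. The sketch below shows how a caller might turn that into an explicit message. It is written under assumptions: OfflineDiskStoreView is a hypothetical stand-in that exposes only the getOfflineDiskStores() accessor named in the issue, its String[] return type is assumed here, and the class and method names are illustrative rather than part of Geode.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical caller-side check; the names here are illustrative, not Geode API. */
final class BackupResultCheckSketch {
  /** Stand-in exposing only the accessor named in the issue (return type assumed). */
  interface OfflineDiskStoreView {
    String[] getOfflineDiskStores();
  }

  /** Returns the offline disk stores as a list, treating null as "none reported". */
  static List<String> offlineDiskStores(OfflineDiskStoreView status) {
    String[] offline = status.getOfflineDiskStores();
    return offline == null ? Collections.<String>emptyList() : Arrays.asList(offline);
  }

  /** Builds a short message that points the operator at the server logs. */
  static String summarize(OfflineDiskStoreView status) {
    List<String> offline = offlineDiskStores(status);
    if (offline.isEmpty()) {
      return "No disk stores reported offline.";
    }
    return offline.size() + " disk store(s) reported offline: "
        + String.join(", ", offline) + ". Check the server logs for the cause.";
  }
}
```

Such a check only tells the caller that something failed; the cause still has to come from the server log, which is why the improved log level matters.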
--
This message was sent by Atlassian Jira
(v8.3.4#803005)