[ 
https://issues.apache.org/jira/browse/HDDS-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai updated HDDS-10614:
------------------------------------
    Fix Version/s: 1.5.0

> Recon fails to start with used space cannot be negative
> -------------------------------------------------------
>
>                 Key: HDDS-10614
>                 URL: https://issues.apache.org/jira/browse/HDDS-10614
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Recon
>    Affects Versions: 1.4.0
>            Reporter: Kirill Sizov
>            Assignee: Mohammad Arafat Khan
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.5.0
>
>
> Recon fails to start with the following exception. A restart does not help.
> {noformat}
> [qtp2008821270-42945] WARN org.eclipse.jetty.server.HttpChannelState: 
> unhandled due to prior sendError
> javax.servlet.ServletException: javax.servlet.ServletException: 
> java.lang.IllegalArgumentException: used space cannot be negative.
>         at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:162)
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>         at org.eclipse.jetty.server.Server.handle(Server.java:516)
>         at 
> org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
>         at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
>         at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
>         at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
>         at 
> org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
>         at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
>         at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
>         at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
>         at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>         at 
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
>         at java.lang.Thread.run(Thread.java:750)
> Caused by: javax.servlet.ServletException: 
> java.lang.IllegalArgumentException: used space cannot be negative.
>         at 
> org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:410)
>         at 
> org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346)
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:366)
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:319)
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
>         at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:290)
>         at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:280)
>         at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:184)
>         at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:89)
>         at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:121)
>         at 
> com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:133)
>         at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
>         at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
>         at 
> org.apache.hadoop.hdds.server.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1648)
>         at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>         at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
>         at 
> org.apache.hadoop.hdds.server.http.NoCacheFilter.doFilter(NoCacheFilter.java:48)
>         at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>         at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
>         at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:600)
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
>         at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
>         at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
>         at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
>         ... 17 more
> Caused by: java.lang.IllegalArgumentException: used space cannot be negative.
>         at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
>         at 
> org.apache.hadoop.hdds.scm.container.placement.metrics.SCMNodeStat.<init>(SCMNodeStat.java:43)
>         at 
> org.apache.hadoop.hdds.scm.node.SCMNodeManager.getNodeStatInternal(SCMNodeManager.java:973)
>         at 
> org.apache.hadoop.hdds.scm.node.SCMNodeManager.getNodeStats(SCMNodeManager.java:882)
>         at 
> org.apache.hadoop.hdds.scm.node.SCMNodeManager.getStats(SCMNodeManager.java:853)
>         at 
> org.apache.hadoop.ozone.recon.api.ClusterStateEndpoint.getClusterState(ClusterStateEndpoint.java:120)
>         at sun.reflect.GeneratedMethodAccessor228.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
>         at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124)
>         at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167)
>         at 
> org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:176)
>         at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79)
>         at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:475)
>         at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:397)
>         at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
>         at 
> org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255)
>         at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
>         at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
>         at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
>         at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
>         at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
>         at 
> org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
>         at 
> org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234)
>         at 
> org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:680)
>         at 
> org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394)
>         ... 50 more
> {noformat}
> *Possible reason:*
> Datanodes send reports with negative used space. 
> On the datanode side {{CachingSpaceUsageSource}} has a method 
> {{decrementUsedSpace}} which decrements total used size by a particular 
> value. The total used size is in turn taken from a delegate 
> {{SpaceUsageSource}}. If the latter fails to return a value, 
> {{decrementUsedSpace}} will subtract from 0, hence a negative value.
> A failure to get used size might be caused by smth like:
> {noformat}
> 2024-03-21 13:30:11,613 [DiskUsage-/data3-
> ] WARN org.apache.hadoop.hdds.fs.CachingSpaceUsageSource: Error refreshing 
> space usage for /data3
> java.io.UncheckedIOException: ExitCodeException exitCode=1: du: cannot read 
> directory '/data3/lost+found': Permission denied
>         at org.apache.hadoop.hdds.fs.DU$DUShell.getUsed(DU.java:94)
>         at 
> org.apache.hadoop.hdds.fs.AbstractSpaceUsageSource.time(AbstractSpaceUsageSource.java:56)
>         at org.apache.hadoop.hdds.fs.DU.getUsedSpace(DU.java:63)
>         at 
> org.apache.hadoop.hdds.fs.CachingSpaceUsageSource.refresh(CachingSpaceUsageSource.java:140)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:750)
> Caused by: ExitCodeException exitCode=1: du: cannot read directory 
> '/data3/lost+found': Permission denied
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1068)
>         at org.apache.hadoop.util.Shell.run(Shell.java:957)
>         at org.apache.hadoop.hdds.fs.DU$DUShell.getUsed(DU.java:91)
>         ... 10 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to