[jira] [Updated] (SLING-9030) SimpleDistributionAgentQueueProcessor does not distinguish between recoverable and non-recoverable exceptions

Mohit Arora (Jira) Mon, 27 Jan 2020 08:03:26 -0800


     [ 
https://issues.apache.org/jira/browse/SLING-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mohit Arora updated SLING-9030:
-------------------------------
    Description: 
[SimpleDistributionAgentQueueProcessor.java|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/agent/impl/SimpleDistributionAgentQueueProcessor.java#L83]
 is responsible for processing a queueItem which is then passed on to 
[RemoteDistributionPackageImporter#importPackage()|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/packaging/impl/importer/RemoteDistributionPackageImporter.java#L59]
 which in turn selects a valid transporter and sends the POST request through 
[SimpleHttpDistributionTransport#deliverPackage()|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/transport/impl/SimpleHttpDistributionTransport.java#L108].
 There can be 2 types of exceptions thrown by this deliverPackage() function. 
One is a 
[RecoverableDistributionException|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/common/RecoverableDistributionException.java]
 which is a subclass of DistributionException, and the other is a plain 
DistributionException. As the name suggests, a RecoverableDistributionException 
indicates a failure after which the transport is tried again. But it seems there 
is currently no cap on the number of retries.

For example, if the endpoint is temporarily inaccessible, the error logs of 
the caller application will be flooded with entries for the constant retries of 
the DistributionPackages in the queue, until the endpoint comes back up and the 
distribution succeeds. The reason is that [the verbose logging done 
here|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/agent/impl/SimpleDistributionAgentQueueProcessor.java#L100]
 does not distinguish between a RecoverableDistributionException and a normal 
DistributionException. This would lead to a sharp increase in the disk usage of 
the caller application. Perhaps the logging for RecoverableDistributionException 
could be less verbose and done at *WARN* level.
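
To illustrate the suggestion, here is a minimal, self-contained sketch, not the actual Sling code. The two exception classes are hypothetical stand-ins that only mirror the names of the Sling types, the Delivery interface and the processItem() and levelFor() helpers are invented for illustration, and java.util.logging is used only so the example runs without extra dependencies (its WARNING and SEVERE levels stand in for WARN and ERROR).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class QueueLoggingSketch {

    // Hypothetical stand-ins for the Sling exception types, so the sketch compiles on its own.
    static class DistributionException extends Exception {
        DistributionException(String message) { super(message); }
    }

    static class RecoverableDistributionException extends DistributionException {
        RecoverableDistributionException(String message) { super(message); }
    }

    // Hypothetical stand-in for the import/transport step that may fail.
    interface Delivery {
        void deliver() throws DistributionException;
    }

    private static final Logger LOG = Logger.getLogger(QueueLoggingSketch.class.getName());

    // Recoverable failures map to WARNING, everything else to SEVERE.
    static Level levelFor(DistributionException e) {
        return (e instanceof RecoverableDistributionException) ? Level.WARNING : Level.SEVERE;
    }

    // Returns true if the item was delivered, false if delivery failed.
    static boolean processItem(String itemId, Delivery delivery) {
        try {
            delivery.deliver();
            return true;
        } catch (DistributionException e) {
            Level level = levelFor(e);
            if (level == Level.WARNING) {
                // Expected to be retried: one short line, no stack trace.
                LOG.log(level, "Item {0} not delivered, will be retried: {1}",
                        new Object[] {itemId, e.getMessage()});
            } else {
                // Unexpected failure: keep the full stack trace.
                LOG.log(level, "Item " + itemId + " could not be delivered", e);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        processItem("item-1", () -> {
            throw new RecoverableDistributionException("endpoint unreachable");
        });
        processItem("item-2", () -> {
            throw new DistributionException("invalid package");
        });
    }
}
```

With this split, a long outage of the endpoint would produce one short WARNING line per retry instead of a full stack trace each time, while non-recoverable failures keep their detailed output.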

cc - [~ashishc], [~marett]

  was:
[SimpleDistributionAgentQueueProcessor.java|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/agent/impl/SimpleDistributionAgentQueueProcessor.java#L83]
 is responsible for processing a queueItem which is then passed on to 
[RemoteDistributionPackageImporter#importPackage()|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/packaging/impl/importer/RemoteDistributionPackageImporter.java#L59]
 which in turn selects a valid transporter and send the POST request through 
[SimpleHttpDistributionTransport#deliverPackage|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/transport/impl/SimpleHttpDistributionTransport.java#L108].
 There can be 2 type of exceptions thrown by this deliverPackage() function. 
One is a 
[RecoverableDistributionException|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/common/RecoverableDistributionException.java]
 which is a type of DistributionException and another is DistributionException. 
As the name suggests, a RecoverableDistributionException is where the transport 
is tried again. But it seems there is currently no cap on the number of retries.

For example, if the endpoint is not accessible at the moment, the error logs of 
the caller application will be flooded with constant retries of the 
DistributionPackages in queue, until the endpoint comes up and the distribution 
is successful. The reason being, [the verbose logging done 
here|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/agent/impl/SimpleDistributionAgentQueueProcessor.java#L100]
 does not distinguish between a RecoverableDistributionException and a normal 
DistributionException. This would lead to sharp increase in disk size of the 
caller application. Perhaps the logging can be less verbose and can be logged 
at *WARN* level for RecoverableDistributionException.

cc - [~ashishc], [~marett]


> SimpleDistributionAgentQueueProcessor does not distinguish between 
> recoverable and non-recoverable exceptions
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SLING-9030
>                 URL: https://issues.apache.org/jira/browse/SLING-9030
>             Project: Sling
>          Issue Type: Bug
>          Components: Content Distribution
>            Reporter: Mohit Arora
>            Priority: Major
>             Fix For: Content Distribution Core 0.4.2



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
