[ 
https://issues.apache.org/jira/browse/SLING-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16332072#comment-16332072
 ] 

Dirk Rudolph commented on SLING-3967:
-------------------------------------

Another thing that matters here is the guarantee of processing. With the 
introduction of support for systems other than Sling, and for arbitrary 
customisation of how DistributionRequests are serialised, the time and 
resources needed to export a "package", in whatever format it ends up being, 
are undefined. So the bigger the DistributionRequests are, the more likely it 
is that creating the package already fails, which, depending on the trigger 
used, might cause the loss of the entire request. 

For example, using SCD to index binary documents in Solr might require parsing 
them with Tika and sending only the extracted plain text as the result. 
Depending on the documents to distribute, it might make sense to split the 
DistributionRequest so that each package takes roughly the same, predictable 
time to create.
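
To illustrate the kind of split meant here, below is a minimal sketch that 
groups paths into batches bounded by a node count and an estimated size. It is 
only an assumption of how this could look: RequestSplitter, its split method 
and the per-path size estimates are hypothetical and not part of the Sling 
Content Distribution API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of the Sling Content Distribution API. */
final class RequestSplitter {

    private RequestSplitter() {}

    /**
     * Groups paths into batches in the map's iteration order. A batch never
     * holds more than maxNodes paths and, where possible, stays within
     * maxBytes of estimated size. A single path whose estimate exceeds
     * maxBytes still gets a batch of its own.
     */
    static List<List<String>> split(Map<String, Long> estimatedBytesByPath,
                                    int maxNodes, long maxBytes) {
        if (maxNodes < 1 || maxBytes < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;
        for (Map.Entry<String, Long> entry : estimatedBytesByPath.entrySet()) {
            long size = entry.getValue();
            // Close the current batch if adding this path would break a limit.
            boolean full = current.size() >= maxNodes
                    || (!current.isEmpty() && currentBytes + size > maxBytes);
            if (full) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(entry.getKey());
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each resulting batch could then be submitted as its own DistributionRequest, 
so that a failure while creating one package loses only that batch rather than 
the entire request. For the Tika case the size estimate could be replaced by 
any cost estimate that tracks parsing time better than raw bytes.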

> Define replication strategy for big trees
> -----------------------------------------
>
>                 Key: SLING-3967
>                 URL: https://issues.apache.org/jira/browse/SLING-3967
>             Project: Sling
>          Issue Type: Improvement
>          Components: Content Distribution
>            Reporter: Marius Petria
>            Priority: Major
>
> An extreme case for replication is the replication of an entire big tree (for 
> example a /content/bigtree/* with GBs of content).
> One should be able to define a way to replicate this in smaller packages, so 
> that it does not create packages so big that they affect performance.
> Options to do the split:
> - number of nodes (every 100 nodes)
> - size of data (every 100 MB)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
