[
https://issues.apache.org/jira/browse/SLING-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16332072#comment-16332072
]
Dirk Rudolph edited comment on SLING-3967 at 1/19/18 10:25 AM:
---------------------------------------------------------------
Another thing that matters here is the guarantee of processing. With the
introduction of support for systems other than Sling, and for arbitrary
customisation of the serialisation of DistributionRequests, the time and
resources needed to export a "package", in whatever format it takes, are
undefined. So the bigger the DistributionRequests are, the more likely it is
that creating the package already fails, which, depending on the trigger used,
might cause the loss of the entire request.
For example, using SCD to index binary documents in Solr might require parsing
them with Tika and sending only the plain text as the result. Depending on the
documents to distribute, it might make sense to split the DistributionRequest
to keep the package creation time close to an approximate mean.
This looks like a change that is needed in the
LocalDistributionPackageExporter only, if deep paths are not taken into account.
was (Author: diru):
Another thing that matters here is the guarantee of processing. With the
introduction of support for systems other than Sling, and for arbitrary
customisation of the serialisation of DistributionRequests, the time and
resources needed to export a "package", in whatever format it takes, are
undefined. So the bigger the DistributionRequests are, the more likely it is
that creating the package already fails, which, depending on the trigger used,
might cause the loss of the entire request.
For example, using SCD to index binary documents in Solr might require parsing
them with Tika and sending only the plain text as the result. Depending on the
documents to distribute, it might make sense to split the DistributionRequest
to keep the package creation time close to an approximate mean.
This looks like a change that is needed in the
LocalDistributionPackageExporter only.
> Define replication strategy for big trees
> -----------------------------------------
>
> Key: SLING-3967
> URL: https://issues.apache.org/jira/browse/SLING-3967
> Project: Sling
> Issue Type: Improvement
> Components: Content Distribution
> Reporter: Marius Petria
> Priority: Major
>
> An extreme case for replication is the replication of an entire big tree (for
> example a /content/bigtree/* with GBs of content).
> One should be able to define a way to replicate this in smaller packages so
> that it does not create packages big enough to affect performance.
> Options to do the split:
> - number of nodes (every 100 nodes)
> - size of data (every 100 MB)
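
The two split options listed above (by number of nodes and by size of data)
could be combined in one greedy pass. The following is only a minimal sketch
in plain Java, not the Sling Content Distribution API: RequestSplitter, Item
and split are hypothetical names, and it assumes the paths and their
approximate sizes are already known before packaging. Each resulting batch of
paths would then be turned into its own smaller DistributionRequest.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Sling: groups paths into batches that
// respect a node-count limit and a byte-size limit.
final class RequestSplitter {

    // Hypothetical stand-in for one node to distribute.
    static final class Item {
        final String path;
        final long sizeBytes; // assumed to be known or estimated up front

        Item(String path, long sizeBytes) {
            this.path = path;
            this.sizeBytes = sizeBytes;
        }
    }

    // Greedy split: start a new batch as soon as adding the next item would
    // exceed maxBytes, or the current batch already holds maxNodes paths.
    // An item larger than maxBytes still gets a batch of its own.
    static List<List<String>> split(List<Item> items, int maxNodes, long maxBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;

        for (Item item : items) {
            boolean full = !current.isEmpty()
                    && (current.size() >= maxNodes
                        || currentBytes + item.sizeBytes > maxBytes);
            if (full) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(item.path);
            currentBytes += item.sizeBytes;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With the limits from the description this would be called as
split(items, 100, 100L * 1024 * 1024). The same loop would work with an
estimated processing cost per item instead of a byte size, if the goal is to
bound package creation time rather than package size.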