[
https://issues.apache.org/jira/browse/BEAM-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399530#comment-16399530
]
Eugene Kirpichov commented on BEAM-3820:
----------------------------------------
The lack of control is one of the biggest reasons why we are strongly against
tunable parameters - unless the user has full control over the environment
(which they don't - they don't control the runner or the implementation of the
PTransform), they cannot assume that setting the parameter to a particular
value will reliably achieve a particular effect. See discussion on
https://github.com/apache/beam/pull/4461 for more elaboration - I'm happy to
add this argument to
https://beam.apache.org/contribute/ptransform-style-guide/#what-parameters-to-expose
which already recommends against tunable parameters.
Could you elaborate on the tuning of batch sizes that you needed to do to
get your job to work, on top of the retry strategy (which I agree needs to be
implemented)? Perhaps we can make SolrIO do it automatically. If it turns out
that there's no way to automatically set a value that will work reliably for
all users, then that falls under "unless it’s impossible to automatically
supply or compute a good-enough value" and then we can add it. But we should
not do so lightly.
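For context on what is being discussed, here is a minimal sketch of the buffer-and-flush pattern such a connector uses, with the batch size as a constructor parameter rather than a hard-coded constant. This is purely illustrative: BatchingWriter and its fields are hypothetical, not the actual SolrIO implementation, and the real connector would send each batch to Solr where this sketch only counts flushes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not SolrIO code: buffers documents and flushes
// whenever the buffer reaches batchSize (SolrIO hard-codes 1000 today).
final class BatchingWriter {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    int flushes = 0; // number of batches "sent", for demonstration only

    BatchingWriter(int batchSize) {
        this.batchSize = batchSize;
    }

    void add(String doc) {
        buffer.add(doc);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // The real connector would issue the Solr update request here.
        flushes++;
        buffer.clear();
    }
}

public class Main {
    public static void main(String[] args) {
        // A user-chosen value of 100 instead of the hard-coded 1000.
        BatchingWriter w = new BatchingWriter(100);
        for (int i = 0; i < 250; i++) {
            w.add("doc-" + i);
        }
        w.flush(); // flush the final partial batch
        System.out.println(w.flushes); // prints 3: two full batches plus one partial
    }
}
```

The question in this thread is whether that constructor argument should be user-facing at all, or whether the connector can pick a reliable value itself.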
> SolrIO: Allow changing batchSize for writes
> -------------------------------------------
>
> Key: BEAM-3820
> URL: https://issues.apache.org/jira/browse/BEAM-3820
> Project: Beam
> Issue Type: Improvement
> Components: io-java-solr
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Tim Robertson
> Assignee: Ismaël Mejía
> Priority: Trivial
>
> SolrIO hard-codes the batchSize for writes at 1000. It would be a good
> addition to allow the user to set the batchSize explicitly (similar to
> ElasticsearchIO).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)