[jira] [Commented] (BEAM-3849) SolrIO: Expose connection timeout tuning for writes

Tim Robertson (JIRA) Wed, 14 Mar 2018 14:30:18 -0700

    [ 
https://issues.apache.org/jira/browse/BEAM-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399446#comment-16399446
 ]


Tim Robertson commented on BEAM-3849:
-------------------------------------

Unless I'm mistaken, the current default timeouts are infinite.

I observed this when running on a Beam/Spark/YARN cluster talking to a SOLR 5.4 
cloud (HDFS) cluster. In my case the result was Spark task failures due to 
inactivity, which meant redoing the whole partition of the Spark RDD. The root 
cause here was the SOLR server not responding, but it seems to me a useful 
addition to be able to control failure scenarios explicitly rather than relying 
on the retries of the execution engine. Timing out, and then being able to 
control retry behaviour for the batch explicitly (BEAM-3848), seemed like 
sensible additions, which I have patched into my own version.
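
To illustrate the idea of timing out and then retrying under explicit control, 
here is a minimal sketch in plain JDK Java. It is not the SolrIO patch and uses 
no solrj or Beam API: the class BoundedWrite, the method callWithTimeout and its 
parameters are hypothetical names chosen for this example. In the real fix the 
timeouts would be set on the solrj HTTP client itself rather than enforced from 
a wrapper like this.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: bounds each attempt by a timeout and retries a fixed number of times. */
public class BoundedWrite {

  /**
   * Runs op, waiting at most timeoutMillis per attempt, for up to maxAttempts attempts.
   * Returns the first successful result, or rethrows the last failure
   * (a TimeoutException or an ExecutionException wrapping the cause).
   */
  public static <T> T callWithTimeout(Callable<T> op, long timeoutMillis, int maxAttempts)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    // Cached pool of daemon threads: a hung attempt neither blocks the next
    // attempt nor keeps the JVM alive.
    ExecutorService executor = Executors.newCachedThreadPool(r -> {
      Thread t = new Thread(r);
      t.setDaemon(true);
      return t;
    });
    try {
      Exception last = null;
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        Future<T> future = executor.submit(op);
        try {
          return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
          future.cancel(true); // interrupt the stalled attempt
          last = e;
        } catch (ExecutionException e) {
          last = e; // the operation itself failed; try again
        }
      }
      throw last; // non-null here because maxAttempts >= 1
    } finally {
      executor.shutdownNow();
    }
  }
}
```

A caller would wrap the batch write, for example 
callWithTimeout(() -> sendBatch(docs), 30000, 3), where sendBatch stands in for 
whatever performs the write. The failure then surfaces after a bounded time in 
the writer, instead of as a task killed for inactivity by the execution engine.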

> SolrIO: Expose connection timeout tuning for writes
> ---------------------------------------------------
>
>                 Key: BEAM-3849
>                 URL: https://issues.apache.org/jira/browse/BEAM-3849
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-solr
>    Affects Versions: 2.2.0, 2.3.0
>            Reporter: Tim Robertson
>            Assignee: Ismaël Mejía
>            Priority: Minor
>
> A useful addition would be the ability to tune the socket and connection 
> timeouts for the underlying solrj client.
> Currently createHttpClient() uses defaults only.
> This relates to BEAM-3820 and BEAM-3848 which together will help improve 
> stability of jobs doing large (billions of docs) loading of SOLR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
