[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-10 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=89356=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89356
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 10/Apr/18 12:04
Start Date: 10/Apr/18 12:04
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-380074408
 
 
   Btw, this looks like closed and not merged on github only because I manually 
merged it, it is already merged.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89356)
Time Spent: 7h 40m  (was: 7.5h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-10 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=89354=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89354
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 10/Apr/18 12:03
Start Date: 10/Apr/18 12:03
Worklog Time Spent: 10m 
  Work Description: iemejia closed pull request #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/io/solr/pom.xml b/sdks/java/io/solr/pom.xml
index 53e4d1fa17c..a0d93a26e41 100644
--- a/sdks/java/io/solr/pom.xml
+++ b/sdks/java/io/solr/pom.xml
@@ -55,6 +55,16 @@
 commons-compress
 
 
+
+org.slf4j
+slf4j-api
+
+
+
+joda-time
+joda-time
+
+
 
 
 com.google.auto.value
@@ -101,6 +111,13 @@
 tests
 
 
+
+org.apache.beam
+beam-sdks-java-core
+test
+tests
+
+
 
 org.apache.beam
 beam-runners-direct-java
@@ -129,24 +146,23 @@
 
 
 
-org.slf4j
-slf4j-api
+com.carrotsearch.randomizedtesting
+randomizedtesting-runner
+2.3.2
 test
 
 
 
 com.carrotsearch.randomizedtesting
-randomizedtesting-runner
+junit4-ant
 2.3.2
 test
 
 
 
 org.slf4j
-slf4j-log4j12
-${slf4j.version}
+slf4j-jdk14
 test
 
 
-
 
diff --git 
a/sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java 
b/sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
index 0384417c6f9..b80abf96c8d 100644
--- a/sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
+++ b/sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
@@ -22,6 +22,7 @@
 
 import com.google.auto.value.AutoValue;
 import com.google.common.annotations.VisibleForTesting;
+import com.google.common.collect.ImmutableSet;
 import com.google.common.util.concurrent.MoreExecutors;
 import com.google.common.util.concurrent.ThreadFactoryBuilder;
 import java.io.IOException;
@@ -33,10 +34,12 @@
 import java.util.Iterator;
 import java.util.List;
 import java.util.NoSuchElementException;
+import java.util.Set;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
 import java.util.concurrent.Future;
+import java.util.function.Predicate;
 import javax.annotation.Nullable;
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.coders.Coder;
@@ -46,6 +49,10 @@
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.transforms.ParDo;
 import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.util.BackOff;
+import org.apache.beam.sdk.util.BackOffUtils;
+import org.apache.beam.sdk.util.FluentBackoff;
+import org.apache.beam.sdk.util.Sleeper;
 import org.apache.beam.sdk.values.PBegin;
 import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.sdk.values.PDone;
@@ -62,6 +69,7 @@
 import org.apache.solr.client.solrj.response.QueryResponse;
 import org.apache.solr.client.solrj.response.schema.SchemaResponse;
 import org.apache.solr.common.SolrDocument;
+import org.apache.solr.common.SolrException;
 import org.apache.solr.common.SolrInputDocument;
 import org.apache.solr.common.cloud.ClusterState;
 import org.apache.solr.common.cloud.DocCollection;
@@ -72,6 +80,9 @@
 import org.apache.solr.common.params.CursorMarkParams;
 import org.apache.solr.common.params.ModifiableSolrParams;
 import org.apache.solr.common.util.NamedList;
+import org.joda.time.Duration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * Transforms for reading and writing data from/to Solr.
@@ -110,10 +121,15 @@
  * 
inputDocs.apply(SolrIO.write().to("my-collection").withConnectionConfiguration(conn));
  *
  * }
+ *
+ * When writing it is possible to customise the retry behavior should an 
error be encountered. By
+ * default this is disabled and only one attempt will be made to write to SOLR.
  */
 @Experimental(Experimental.Kind.SOURCE_SINK)
 public class SolrIO {
 
+  private static final Logger LOG = LoggerFactory.getLogger(SolrIO.class);
+
   public static Read read() {
 // 1000 for batch size is good enough in many cases,
 // ex: if 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-10 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=89344=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89344
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 10/Apr/18 11:16
Start Date: 10/Apr/18 11:16
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r180146979
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -110,10 +121,15 @@
  * 
inputDocs.apply(SolrIO.write().to("my-collection").withConnectionConfiguration(conn));
  *
  * }
+ *
+ * When writing it is possible to customise the retry behavior should an 
error be encountered. By
 
 Review comment:
   s/should/if


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89344)
Time Spent: 7h 20m  (was: 7h 10m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-10 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=89343=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89343
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 10/Apr/18 11:16
Start Date: 10/Apr/18 11:16
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r180146979
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -110,10 +121,15 @@
  * 
inputDocs.apply(SolrIO.write().to("my-collection").withConnectionConfiguration(conn));
  *
  * }
+ *
+ * When writing it is possible to customise the retry behavior should an 
error be encountered. By
 
 Review comment:
   s/should/if


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89343)
Time Spent: 7h 10m  (was: 7h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-05 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=88203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-88203
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 05/Apr/18 19:24
Start Date: 05/Apr/18 19:24
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on issue #4905: [BEAM-3848] 
Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-379049114
 
 
   @iemejia - that should be rebased, squashed and presented as one commit.  
Please grab me on slack if you see otherwise and I'll address it.  I'm afraid 
that with a rebase to origin/master to include Romain's recent PRs did not fix 
the thread problem so the `testWriteRetry()` remains with the hack to await 
thread termination, and manual unregistering of `SolrZKClient` from  threadleak 
detection. Perhaps we/Roman/I could diagnose and tackle that later?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 88203)
Time Spent: 7h  (was: 6h 50m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-05 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=88090=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-88090
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 05/Apr/18 15:33
Start Date: 05/Apr/18 15:33
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-378977996
 
 
   Yes please can you rebase and squash the extra commits please. I will take a 
look just after that (again sorry for my delay).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 88090)
Time Spent: 6h 50m  (was: 6h 40m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-05 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=88070=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-88070
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 05/Apr/18 14:43
Start Date: 05/Apr/18 14:43
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on issue #4905: [BEAM-3848] 
Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-378960656
 
 
   @iemejia - I'll test if the recent commits to master fix the hack for 
awaiting thread termination when we come to rebase this PR if that's ok with 
you?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 88070)
Time Spent: 6h 40m  (was: 6.5h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=87623=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87623
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 04/Apr/18 16:28
Start Date: 04/Apr/18 16:28
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-378662259
 
 
   Arrghh and that just retriggered it again hahaha sorry.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87623)
Time Spent: 6.5h  (was: 6h 20m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=87622=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87622
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 04/Apr/18 16:26
Start Date: 04/Apr/18 16:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-378661839
 
 
   @timrobertson100 hehe no actually `retest this please` is the command we use 
to trigger jenkins again :D !


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87622)
Time Spent: 6h 20m  (was: 6h 10m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=87601=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87601
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 04/Apr/18 15:51
Start Date: 04/Apr/18 15:51
Worklog Time Spent: 10m 
  Work Description: CaoManhDat commented on issue #4905: [BEAM-3848] 
Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-378650202
 
 
   Sorry, I do not have time to fully review the patch. But through a quick 
glance, it looks good to me.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87601)
Time Spent: 6h 10m  (was: 6h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=87600=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87600
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 04/Apr/18 15:51
Start Date: 04/Apr/18 15:51
Worklog Time Spent: 10m 
  Work Description: CaoManhDat commented on issue #4905: [BEAM-3848] 
Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-378650202
 
 
   Sorry, I do not have time to fully review the patch. But it looks good to me.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87600)
Time Spent: 6h  (was: 5h 50m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=87594=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87594
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 04/Apr/18 15:37
Start Date: 04/Apr/18 15:37
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on issue #4905: [BEAM-3848] 
Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-378645016
 
 
   Thanks @iemejia 
   
   > retest this please
   
   Was that meant for me?  If so, can you please elaborate on what you'd like 
me to test? SolrIO tests all pass as far as I can tell.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87594)
Time Spent: 5h 50m  (was: 5h 40m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=87527=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87527
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 04/Apr/18 13:00
Start Date: 04/Apr/18 13:00
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-378591109
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87527)
Time Spent: 5h 40m  (was: 5.5h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-04-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=86364=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-86364
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 01/Apr/18 11:53
Start Date: 01/Apr/18 11:53
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on issue #4905: [BEAM-3848] 
Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-377781538
 
 
   @iemejia Please can you take a look?
   
   This now follows an approach similar to the `JdbcIO` testing. A 
`RetryPredicate` is used to determine if an exception is a candidate for 
retrying.  The predicate configuration is package private since it is only 
intended to enable testing with the embedded server instead of mocking.
   
   Please take a careful look at the exception handling in the test as I hit 
other issues with the DirectRunner and object leak tracking.
   
   Please note I changed the log implementation for tests to follow the same of 
the `JdbcIO` too.
   
   Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 86364)
Time Spent: 5.5h  (was: 5h 20m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-29 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85693=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85693
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 29/Mar/18 15:47
Start Date: 29/Mar/18 15:47
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r178099724
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   Thanks for the quick answer @iemejia - I will do as outlined above.
   
   I'll go for the simple option of initiating retries by writing to a 
non-existing collection.
   
   My thoughts on the version which would stop the server - It would have to be 
a separate test class to not interfer with the other tests. I was thinking of 
starting a monitor thread, running an unbounded stream writing into Solr, 
having the monitor thread query Solr periodically until it sees data and only 
then shut it down.  After this, the retry mechanism would kick in and would 
exhaust retries and then we'd check the logs to verify. While this would be a 
more real-world test and I think could be done robustly, it is probably 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-29 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85677=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85677
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 29/Mar/18 15:15
Start Date: 29/Mar/18 15:15
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r178089696
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   And about the second alternative of stopping the server, I am afraid that 
this introduces some flakiness, so better to have not to do it, It is better to 
have more reproducible (consistent) tests.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 85677)
Time Spent: 5h 10m  (was: 5h)

> SolrIO: Improve retrying mechanism in client writes
> 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-29 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85675=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85675
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 29/Mar/18 15:13
Start Date: 29/Mar/18 15:13
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r178089066
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   @timrobertson100 good idea, I like the fact of being consistent among IOs. 
Agree in this case also about letting the retry in `SolrIO` to not get into 
madness with testing on `AutorizedSolrClient'. Please go ahead !


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 85675)
Time Spent: 5h  (was: 4h 50m)

> SolrIO: Improve retrying mechanism in client writes
> 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-29 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85509=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85509
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 29/Mar/18 06:21
Start Date: 29/Mar/18 06:21
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a `RetryConfiguration` is also provided
2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only and I see is missing in 
the original PR).
3. I introduce `SLF4J` logging at `WARN` when retries are invoked
4. I run the mini-IT style tests using the embedded Solr server with 
1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
2. write to a non 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85422=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85422
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 22:37
Start Date: 28/Mar/18 22:37
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a `RetryConfiguration` is also provided
2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only and I see is missing in 
the original PR).
3. I introduce `SLF4J` logging at `warn` when retries are invoked
4. I run the mini-IT style tests using the embedded Solr server with 
1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
2. write to a non 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85421=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85421
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 22:37
Start Date: 28/Mar/18 22:37
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a `RetryConfiguration` is also provided
2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only) and I see is missing in 
the original PR.
3. I introduce `SLF4J` logging at `warn` when retries are invoked
4. I run the mini-IT style tests using the embedded Solr server with 
1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
2. write to a non 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85419=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85419
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 22:36
Start Date: 28/Mar/18 22:36
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a `RetryConfiguration` is also provided
2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only) and I see is missing in 
the original PR.
3. I introduce `SLF4J` logging at `warn` when retries are invoked
4. I run the mini-IT style tests using the embedded Solr server with 
1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
2. write to a non 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85418=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85418
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 22:35
Start Date: 28/Mar/18 22:35
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a `RetryConfiguration` is also provided
2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only) and I see is missing in 
the original PR.
3. I introduce `SLF4J` logging at `warn` when retries are invoked
4. I run the mini-IT style tests using the embedded Solr server with 
1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
2. write to a non 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85417=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85417
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 22:33
Start Date: 28/Mar/18 22:33
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a `RetryConfiguration` is also provided
2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only) and I see is missing in 
the original PR.
3. I introduce `SLF4J` logging at `warn` when retries are invoked
4. I run the mini-IT style tests using the embedded Solr server with 
1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
2. write to a non 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85416=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85416
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 22:32
Start Date: 28/Mar/18 22:32
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a `RetryConfiguration` is also provided
2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only) and I see is missing in 
the original PR.
3. I introduce logging using `SLF4J` logging at `warn` when retries are 
invoked
4. I run the mini-IT style tests using the embedded Solr server with 
1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
2. 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85415=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85415
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 22:32
Start Date: 28/Mar/18 22:32
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177909607
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   How about the following (copied from the `JdbcIO` design and test approach)
   
1. I introduce a configurable `RetryStrategy` that can be added to the 
`Write` and will come into effect when a the `RetryConfiguration` is also 
provided
2. A `DefaultRetryStrategy` will retry on 
`HttpSolrClient.RemoteSolrException`, `SolrServerException`, `IOException` and 
also on `SolrException` (where the code is `5xx` only) and I see is missing in 
the original PR.
3. I introduce logging using `SLF4J` logging at `warn` when retries are 
invoked
4. I run the mini-IT style tests using the embedded Solr server with 
1. a custom `RetryStrategy` to return `true`(i.e. retry) on any 
`SolrException`
   

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85295=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85295
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 15:37
Start Date: 28/Mar/18 15:37
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177794329
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   @timrobertson100 great idea to move this logic into the 
`AuthorizedSolrClient` class, way cleaner. About the tests I am still hesitant 
about the best way to do it because I am also not a big fan of mocks in 
particular in this case because we have already an embedded version available, 
but I agree that simulating the unavailability is not trivial.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85187=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85187
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 10:19
Start Date: 28/Mar/18 10:19
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177703113
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   Thanks @echauchot. In this case what I'm trying to test is "does the code 
that wraps the SOLR client deal with exceptions raised by falling into a retry 
mechanism". Wouldn't you describe that as an algorithm test?  I'm all for IT as 
well but in this instance, I'd propose it doesn't call for one as we'd have to 
either intercept HTTP or have fake / modified servers.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=85152=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85152
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 28/Mar/18 07:02
Start Date: 28/Mar/18 07:02
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177654144
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   @timrobertson100 I did not see the code, just wanted to comment on your last 
comment: +1 to moving the retry mecanism.
   Also, a general comment about mocks, I'm convinced that mocks are fine to 
test algorithm but not to fake a middleware otherwise you want have a 
compelling test. using embeded middlewares for UT is better to be closer from 
real at a small scale. For larger scale IT are mandatory because some cases 
like cache or timeouts etc...  only raise at a larger scale especially in big 
data projects 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-27 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84975=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84975
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 27/Mar/18 18:40
Start Date: 27/Mar/18 18:40
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177507258
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   @iemejia - How about I push the retry mechanism into the 
`AuthorizedSolrClient`.  The benefits would be that it pollutes the SolrIO less 
and can then be tested in isolation of the DoFn and Pipeline as a simple unit 
test.  A new method such as this would be an option:
   
   https://gist.github.com/timrobertson100/9d653ab82af74f6b709f2bbf25cafa00 



This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

   

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-27 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84974=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84974
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 27/Mar/18 18:40
Start Date: 27/Mar/18 18:40
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177507258
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   @iemejia - How about I push the retry mechanism into the 
`AuthorizedSolrClient`.  The benefits would be that it pollutes the SolrIO less 
and can then be tested in isolation of the DoFn and Pipeline as a simple unit 
test?  A new method such as this would be an option:
   
   https://gist.github.com/timrobertson100/9d653ab82af74f6b709f2bbf25cafa00 



This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

   

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-27 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84949=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84949
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 27/Mar/18 17:27
Start Date: 27/Mar/18 17:27
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177507258
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   @iemejia - How about I explore moving the retrying behaviour into the 
`AuthorizedSolrClient` so that it pollutes the SolrIO less and can be tested in 
isolation of the DoFn and Pipeline? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84949)
Time Spent: 2h 40m  (was: 2.5h)

> SolrIO: Improve retrying mechanism in client writes
> 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-27 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84891=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84891
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 27/Mar/18 14:19
Start Date: 27/Mar/18 14:19
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177441773
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   I'm struggling to find a way to accommodate this.  
   
   The original design was a simple unit test to isolate the `batch()` flushing 
behaviour and then mock the underlying `SolrClient` simulating failure 
scenarios. The tests then ensured that for a single batch flush, depending on 
the configuration, the `SolrClient` was called the correct number of times 
until retries where exhausted (time bound or N).
   
   Moving this up to a pipeline test introduces 1) an issue with how to 
simulate failures as I don't (yet) find an easy way to mock the SolrClient (you 
can't `Mockito.spy()` on the configuration object) and 2) you loose guarantee 
of the number of times a batch flush or SolrClient should be called 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84550=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84550
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 20:31
Start Date: 26/Mar/18 20:31
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177224620
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -661,25 +746,51 @@ public void processElement(ProcessContext context) 
throws Exception {
 SolrInputDocument document = context.element();
 batch.add(document);
 if (batch.size() >= spec.getMaxBatchSize()) {
-  flushBatch();
+  flushBatch(solrClient, batch);
 
 Review comment:
   Maybe if you change the tests as suggested below to not test the Fn directly 
you may avoid passing solrClient at all.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84550)
Time Spent: 2h 20m  (was: 2h 10m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84421=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84421
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 16:00
Start Date: 26/Mar/18 16:00
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#4905: [BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177145422
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   Thank you very much for the review and guidance @iemejia.  I'll try and 
accommodate this tomorrow.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84421)
Time Spent: 2h 10m  (was: 2h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84392=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84392
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 15:26
Start Date: 26/Mar/18 15:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177098150
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -661,25 +746,51 @@ public void processElement(ProcessContext context) 
throws Exception {
 SolrInputDocument document = context.element();
 batch.add(document);
 if (batch.size() >= spec.getMaxBatchSize()) {
-  flushBatch();
+  flushBatch(solrClient, batch);
 
 Review comment:
   I suppose the new solrClient parameter in this method is to be able to test 
it, if this is the case I would prefer that we remove it from there and expose 
it as a package private method in the WriteFn class with the 
`@VisibleForTesting` annotation. Hope it does not make the mocking of the tests 
too complicated, but it is just to let the internal state hidden.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84392)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84399=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84399
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 15:26
Start Date: 26/Mar/18 15:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177105637
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -110,9 +115,19 @@
  * 
inputDocs.apply(SolrIO.write().to("my-collection").withConnectionConfiguration(conn));
  *
  * }
+ *
+ * When writing it is possible to customise the retry behavior should an 
error be encountered. By
+ * default this is disabled and only one attempt will be made to write to SOLR.
  */
 @Experimental(Experimental.Kind.SOURCE_SINK)
 public class SolrIO {
+  // The minimum pause when entering retry
+  private static final Duration MIN_BACKOFF_SECONDS = 
Duration.standardSeconds(1);
 
 Review comment:
   Maybe 5 seconds, no, isn't 1 second a bit short?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84399)
Time Spent: 2h  (was: 1h 50m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84396=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84396
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 15:26
Start Date: 26/Mar/18 15:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177096819
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -623,11 +683,36 @@ Write withMaxBatchSize(int batchSize) {
   return builder().setMaxBatchSize(batchSize).build();
 }
 
+/**
+ * Provide configuration for enabling the retrying of a failed batch call 
to Solr. A batch is
 
 Review comment:
   s/for enabling the retrying of/to retry


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84396)
Time Spent: 1h 50m  (was: 1h 40m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84391=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84391
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 15:26
Start Date: 26/Mar/18 15:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177063827
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -110,9 +115,19 @@
  * 
inputDocs.apply(SolrIO.write().to("my-collection").withConnectionConfiguration(conn));
  *
  * }
+ *
+ * When writing it is possible to customise the retry behavior should an 
error be encountered. By
+ * default this is disabled and only one attempt will be made to write to SOLR.
  */
 @Experimental(Experimental.Kind.SOURCE_SINK)
 public class SolrIO {
+  // The minimum pause when entering retry
+  private static final Duration MIN_BACKOFF_SECONDS = 
Duration.standardSeconds(1);
+
+  // defaults to a single attempt with a long timeout (years)
+  @VisibleForTesting
+  static final RetryConfiguration DEFAULT_RETRY_CONFIGURATION =
 
 Review comment:
   Given that this is only used for the write. Can we make this internal of the 
Write part (just being a bit nitpicky to reduce the scope). (Same for 
MIN_BACKOFF_SECONDS)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84391)
Time Spent: 1.5h  (was: 1h 20m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84394=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84394
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 15:26
Start Date: 26/Mar/18 15:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177097352
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -623,11 +683,36 @@ Write withMaxBatchSize(int batchSize) {
   return builder().setMaxBatchSize(batchSize).build();
 }
 
+/**
+ * Provide configuration for enabling the retrying of a failed batch call 
to Solr. A batch is
+ * considered as failed if the underlying {@link CloudSolrClient} surfaces 
{@link
+ * org.apache.solr.client.solrj.impl.HttpSolrClient.RemoteSolrException}, 
{@link
+ * SolrServerException} or {@link IOException}. Users should consider that 
retrying might
+ * compound the underlying problem which caused the initial failure. Users 
should also be aware
+ * that once retrying is exhausted the error is surfaced to the runner 
which may then
+ * opt to retry the current partition in entirety. Retrying uses an 
exponential backoff
 
 Review comment:
   After '... entirety' you can add 'or abort if the max number of retries of 
the runner is completed' (obvious but better explicit).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84394)
Time Spent: 1h 40m  (was: 1.5h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84393=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84393
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 15:26
Start Date: 26/Mar/18 15:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177131745
 
 

 ##
 File path: 
sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java
 ##
 @@ -263,4 +276,109 @@ public void testSplit() throws Exception {
 // therefore, can not exist an empty shard.
 assertEquals("Wrong number of empty splits", expectedNumSplits, 
nonEmptySplits);
   }
+
+  /**
+   * Ensure that the retrying is ignored under success conditions.
+   */
+  @Test
+  public void testWriteDefaultRetrySuccess() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(SolrIO.RetryConfiguration.create(10, 
Duration.standardSeconds(10)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate success
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenReturn(mock(SolrResponse.class));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+writeFn.flushBatch(solrClient, batch);
+verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+  }
+
+  /**
+   * Ensure that the default retrying behavior surfaces errors immediately 
under failure conditions.
+   */
+  @Test
+  public void testWriteRetryFail() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+
when(write.getRetryConfiguration()).thenReturn(SolrIO.DEFAULT_RETRY_CONFIGURATION);
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(new SolrServerException("Fail"));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  verify(solrClient, times(1)).process(any(String.class), 
any(SolrRequest.class));
+}
+  }
+
+  /**
+   * Ensure that a time bounded retrying is observed.
+   */
+  @Test
+  public void testWriteRetryTimeBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
+when(write.getRetryConfiguration())
+.thenReturn(
+SolrIO.RetryConfiguration.create(Integer.MAX_VALUE, 
Duration.standardSeconds(3)));
+SolrIO.Write.WriteFn writeFn = new SolrIO.Write.WriteFn(write);
+AuthorizedSolrClient solrClient = mock(AuthorizedSolrClient.class);
+
+// simulate failure
+when(solrClient.process(any(String.class), any(SolrRequest.class)))
+.thenThrow(
+new HttpSolrClient.RemoteSolrException(
+"localhost", 1, "ignore", new IOException("Network")));
+
+List batch = SolrIOTestUtils.createDocuments(1);
+Stopwatch stopwatch = Stopwatch.createStarted();
+
+try {
+  writeFn.flushBatch(solrClient, batch);
+  fail("Error should have been surfaced when flushing batch");
+} catch (IOException e) {
+  // at least two attempts must be made
+  verify(solrClient, Mockito.atLeast(2)).process(any(String.class), 
any(SolrRequest.class));
+  long seconds = stopwatch.elapsed(TimeUnit.SECONDS);
+  assertTrue(
+  "Retrying should have executed for at least 3 seconds but was " + 
seconds,
+  seconds >= 3);
+}
+  }
+
+  /**
+   * Ensure that retries are initiated up to a limited number.
+   */
+  @Test
+  public void testWriteRetryAttemptBound() throws Exception {
+SolrIO.Write write = mock(SolrIO.Write.class);
 
 Review comment:
   Recent guidelines for IOs encourage to test more at the PTransform level 
than at the DoFn one which means better use TestPipeline + PAssert et co. Not 
sure it would be 'easy' to cover these cases like that, but maybe worth a try, 
at least this one seems straightforward.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84393)
Time Spent: 1h 40m  (was: 1.5h)

> SolrIO: Improve 

[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-26 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=84395=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-84395
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 26/Mar/18 15:26
Start Date: 26/Mar/18 15:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on a change in pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#discussion_r177105637
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -110,9 +115,19 @@
  * 
inputDocs.apply(SolrIO.write().to("my-collection").withConnectionConfiguration(conn));
  *
  * }
+ *
+ * When writing it is possible to customise the retry behavior should an 
error be encountered. By
+ * default this is disabled and only one attempt will be made to write to SOLR.
  */
 @Experimental(Experimental.Kind.SOURCE_SINK)
 public class SolrIO {
+  // The minimum pause when entering retry
+  private static final Duration MIN_BACKOFF_SECONDS = 
Duration.standardSeconds(1);
 
 Review comment:
   Maybe 5 seconds, no, isn't 1 second a bit short, no?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 84395)
Time Spent: 1h 40m  (was: 1.5h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=83806=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83806
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:54
Start Date: 23/Mar/18 20:54
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-375795482
 
 
   @timrobertson100 my excuses I was quite busy with other stuff I will take a 
look and give you some feedback at the beginning of next week at latest. Thanks 
for the contribution and for your interest in improving SolrIO.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83806)
Time Spent: 1h 20m  (was: 1h 10m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=83798=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83798
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:36
Start Date: 23/Mar/18 20:36
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-375790839
 
 
   Thanks! Looks generally reasonable to me, but @iemejia would you mind doing 
the first detailed round here?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83798)
Time Spent: 1h 10m  (was: 1h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-22 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=83251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83251
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 22/Mar/18 17:14
Start Date: 22/Mar/18 17:14
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-375386185
 
 
   Tim - I've been traveling past few days (including today), I'll get to this 
in the next few days after sorting out some of my backlog.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83251)
Time Spent: 1h  (was: 50m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-21 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=82698=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82698
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 21/Mar/18 09:33
Start Date: 21/Mar/18 09:33
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-374878432
 
 
   @timrobertson100 I did not notice you were working on the 2 similar tickets. 
It is better indeed. More efficient !


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 82698)
Time Spent: 50m  (was: 40m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-21 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=82689=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82689
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 21/Mar/18 09:01
Start Date: 21/Mar/18 09:01
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on issue #4905: [BEAM-3848] 
Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-374870992
 
 
   Thanks @echauchot. I'll provide a PR for BEAM-3026 (retry in 
ElasticsearchIO) if this one completes - being my first contribution I'm 
expecting to have to make changes though.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 82689)
Time Spent: 40m  (was: 0.5h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-21 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=82684=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82684
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 21/Mar/18 08:53
Start Date: 21/Mar/18 08:53
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-374869074
 
 
   Nice ! It gives me pointers on how to implement the same thing for ES too :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 82684)
Time Spent: 0.5h  (was: 20m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-20 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=82346=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82346
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 20/Mar/18 17:21
Start Date: 20/Mar/18 17:21
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-374683572
 
 
   @CaoManhDat If you have some time maybe worth your comments too.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 82346)
Time Spent: 20m  (was: 10m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-20 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=82290=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82290
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 20/Mar/18 14:57
Start Date: 20/Mar/18 14:57
Worklog Time Spent: 10m 
  Work Description: timrobertson100 opened a new pull request #4905: 
[BEAM-3848] Enables ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905
 
 
   This adds retying behavior when a failure writing to SOLR occurs.  
   
   The caller can provide max attempts and max duration for retrying. Default 
behavior should be unchanged, and the code follows the same pattern as other 
IOs (e.g. JdbcIO) for the expontential backoff.
   
   Please note that the `SolrIO.Write.WriteFn.flushBatch()` signature was 
modified to allow unit testing.
   
   [CC @jfkk @iemejia @echauchot]


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 82290)
Time Spent: 10m
Remaining Estimate: 0h

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A busy SOLR server is prone to return RemoteSOLRException on writing which 
> currently fails a complete task (e.g. a partition of a spark RDD being 
> written to SOLR).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)