[ 
https://issues.apache.org/jira/browse/LUCENE-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324920#comment-14324920
 ] 

Steve Rowe commented on LUCENE-6231:
------------------------------------

bq. As frustrating as Robert's "benevolent terrorist" tactics may be

While Robert's attitude toward Lucene may be benevolent, his attitude toward 
his fellow devs is anything but.  Why are we putting up with a terrorist in our 
midst?

bq. I think his technical veto here is valid: adding such retries to our smoke 
tester can only mask real problems, like too many files / too big files in our 
release artifacts

With a request failure rate above 10%, the number of files in our release 
artifacts would have to be reduced to on the order of one for your statement to 
make sense.  And as I've stated numerous times on this issue, *file size is not 
relevant to the problem*.

bq. something wrong with p.a.o breaking HTTP connections (Has anyone notified 
infra? Shouldn't we do so, instead of hiding the root cause?).

Yes, we should, and Jan has done so.  I've provided evidence of the nature and 
scope of the problem on the ticket.

The probability of INFRA fixing the problem in the 5.0 vote window is roughly 
0%. 

bq. And when Robert holds an issue hostage like this, I've also noticed it 
motivates him to work hard to resolve the "real" issue that is his target, and 
net/net the projects moves forward.

Hooray, Robert is such a successful terrorist, plus he works hard!  Onward 
soldiers, friendly fire is fun!

bq. This smoke tester is not something tons and tons of Lucene/Solr users use: 
it is a specialized tool, and it need not be bullet proof. It's only used by a 
handful of us to validate a release. It really ought to be brittle when 
something isn't right... to get our attention so we fix the real problem.

Do you really want to disincentivize running the smoke tester?  Seriously???

> smokeTestRelease.py should retry failed downloads
> -------------------------------------------------
>
>                 Key: LUCENE-6231
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6231
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>             Fix For: 5.0, Trunk, 5.1
>
>         Attachments: LUCENE-6231-part-2.patch, LUCENE-6231-part-3.patch, 
> LUCENE-6231-part-3.patch, LUCENE-6231-part-3.patch, LUCENE-6231.patch
>
>
> In the 5.0 RC2 vote thread, [~anshumg] mentioned that 6 attempts at running 
> the smoke tester against the people.apache.org RC URL all failed because of 
> download failures.
> I had the same problem - my first two attempts also failed because of failed 
> downloads - here's the trace from one of them:
> {noformat}
> Traceback (most recent call last):
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py",
>  line 1248, in do_open
>     h.request(req.get_method(), req.selector, req.data, headers)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/client.py",
>  line 1061, in request
>     self._send_request(method, url, body, headers)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/client.py",
>  line 1099, in _send_request
>     self.endheaders(body)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/client.py",
>  line 1057, in endheaders
>     self._send_output(message_body)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/client.py",
>  line 902, in _send_output
>     self.send(msg)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/client.py",
>  line 840, in send
>     self.connect()
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/http/client.py",
>  line 818, in connect
>     self.timeout, self.source_address)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socket.py",
>  line 435, in create_connection
>     raise err
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/socket.py",
>  line 426, in create_connection
>     sock.connect(sa)
> TimeoutError: [Errno 60] Operation timed out
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "dev-tools/scripts/smokeTestRelease.py", line 117, in download
>     fIn = urllib.request.urlopen(urlString)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py",
>  line 156, in urlopen
>     return opener.open(url, data, timeout)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py",
>  line 469, in open
>     response = self._open(req, data)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py",
>  line 487, in _open
>     '_open', req)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py",
>  line 447, in _call_chain
>     result = func(*args)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py",
>  line 1268, in http_open
>     return self.do_open(http.client.HTTPConnection, req)
>   File 
> "/Users/sarowe/homebrew/Cellar/python3/3.3.2/Frameworks/Python.framework/Versions/3.3/lib/python3.3/urllib/request.py",
>  line 1251, in do_open
>     raise URLError(err)
> urllib.error.URLError: <urlopen error [Errno 60] Operation timed out>
> The above exception was the direct cause of the following exception:
> Traceback (most recent call last):
>   File "dev-tools/scripts/smokeTestRelease.py", line 1523, in <module>
>     main()
>   File "dev-tools/scripts/smokeTestRelease.py", line 1468, in main
>     smokeTest(c.java, c.url, c.revision, c.version, c.tmp_dir, c.is_signed, ' 
> '.join(c.test_args))
>   File "dev-tools/scripts/smokeTestRelease.py", line 1517, in smokeTest
>     checkMaven(baseURL, tmpDir, svnRevision, version, isSigned)
>   File "dev-tools/scripts/smokeTestRelease.py", line 1012, in checkMaven
>     crawl(artifacts[project], artifactsURL, targetDir)
>   File "dev-tools/scripts/smokeTestRelease.py", line 1280, in crawl
>     crawl(downloadedFiles, subURL, path, exclusions)
>   File "dev-tools/scripts/smokeTestRelease.py", line 1280, in crawl
>     crawl(downloadedFiles, subURL, path, exclusions)
>   File "dev-tools/scripts/smokeTestRelease.py", line 1283, in crawl
>     download(text, subURL, targetDir, quiet=True)
>   File "dev-tools/scripts/smokeTestRelease.py", line 139, in download
>     raise RuntimeError('failed to download url "%s"' % urlString) from e
> RuntimeError: failed to download url 
> "http://people.apache.org/~anshum/staging_area/lucene-solr-5.0.0-RC2-rev1658469//lucene/maven/org/apache/lucene/lucene-analyzers-uima/5.0.0/lucene-analyzers-uima-5.0.0.jar.asc";
> {noformat}
> I did a recursive download of the RC2 folder on people.apache.org using wget, 
> and there were three download failures, which wget auto-retried, and 
> succeeded in each case on the second attempt - two of these were timeouts and 
> the third was a reset connection: 
> {noformat}
> HTTP request sent, awaiting response... No data received.
> Retrying.
> [...]
> HTTP request sent, awaiting response... Read error (Connection reset by peer) 
> in headers.
> Retrying.
> {noformat}
> I think we should just automatically retry all failed downloads once.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to