[
https://issues.apache.org/jira/browse/SPARK-22851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16300740#comment-16300740
]
John Brock edited comment on SPARK-22851 at 12/21/17 11:59 PM:
---------------------------------------------------------------
I think the inconsistent behavior in Chrome is due to different headers being
sent back from the mirrors:
{code:none}
> curl -I http://apache.mirrors.pair.com/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
HTTP/1.1 200 OK
Date: Thu, 21 Dec 2017 23:46:58 GMT
Server: Apache/2.2.29
Last-Modified: Sat, 25 Nov 2017 02:44:26 GMT
ETag: "32b662-bfa03c4-55ec5a5c358a1"
Accept-Ranges: bytes
Content-Length: 200934340
Content-Type: application/x-tar
Content-Encoding: x-gzip
> curl -I http://apache.cs.utah.edu/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
HTTP/1.1 200 OK
Date: Thu, 21 Dec 2017 23:47:19 GMT
Server: Apache/2.2.14 (Ubuntu)
Last-Modified: Sat, 25 Nov 2017 02:44:26 GMT
ETag: "2ae630-bfa03c4-55ec5a5c0d680"
Accept-Ranges: bytes
Content-Length: 200934340
Content-Type: application/x-gzip
{code}
Note that for the first mirror above, {{Content-Type}} is
{{application/x-tar}}, and {{Content-Encoding}} is {{x-gzip}}. For the second
mirror above, {{Content-Type}} is {{application/x-gzip}} and there is no
{{Content-Encoding}} value.
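To make the difference concrete, here is a minimal sketch that classifies the two responses by their headers. The helper name {{browser_would_gunzip}} is hypothetical, and the rule it encodes (the client decodes the body when {{Content-Encoding}} is {{gzip}} or {{x-gzip}}) is a simplifying assumption about browser behavior, not Chrome's actual logic.

```python
# Headers copied from the two `curl -I` responses above.
PAIR_HEADERS = {
    "Content-Type": "application/x-tar",
    "Content-Encoding": "x-gzip",
}
UTAH_HEADERS = {
    "Content-Type": "application/x-gzip",
}


def browser_would_gunzip(headers):
    """Hypothetical helper: True if the headers declare gzip as a content
    coding, which a client may decode before saving the file.

    Assumption: this is a simplified model, not Chrome's real decision logic.
    """
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    encoding = lowered.get("content-encoding", "").strip().lower()
    return encoding in ("gzip", "x-gzip")


print(browser_would_gunzip(PAIR_HEADERS))  # True
print(browser_would_gunzip(UTAH_HEADERS))  # False
```

Under this model only the first mirror's response would be unpacked on the client side, which matches the inconsistent downloads.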
For Safari, both sites give me a tar, so Safari may use some other method than
looking at the header to determine whether a file is a gzip tarball.
EDIT: See the top answer at
https://superuser.com/questions/940605/chromium-prevent-unpacking-tar-gz. It
seems the "bug" is that the first mirror above sends back a
{{Content-Encoding}} value of {{x-gzip}}.
{quote}Your web server is likely sending the .tar.gz file with a
content-encoding: gzip header, causing the web browser to assume a gzip layer
was applied only to save bandwidth, and what you really intended to send was
the .tar archive. Chrome un-gzips it on the other side like it would with any
other file (.html, .js, .css, etc.) that it receives gzipped (it dutifully
doesn't modify the filename though).
To fix this, make sure your web server serves .tar.gz files without the
content-encoding: gzip header.
More Info: https://code.google.com/p/chromium/issues/detail?id=83292{quote}
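Since the filename is left unchanged, a download named {{.tgz}} may in fact be a plain tar. One way to check a downloaded file is to look at its first bytes. The sketch below assumes the file contents are already in memory; the helper names {{is_gzip}} and {{looks_like_plain_tar}} are hypothetical, and the demo archive is a small in-memory stand-in, not the Spark release.

```python
import gzip
import io
import tarfile

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(data):
    """True if the bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def looks_like_plain_tar(data):
    """True if the bytes parse as an uncompressed tar archive."""
    if is_gzip(data):
        return False
    try:
        # Mode "r:" reads tar only, with no transparent decompression.
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as archive:
            archive.getmembers()
        return True
    except tarfile.TarError:
        return False


def make_demo_tar():
    """Build a tiny tar archive in memory (stand-in for a real download)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        payload = b"hello"
        info = tarfile.TarInfo(name="demo.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


tar_bytes = make_demo_tar()
tgz_bytes = gzip.compress(tar_bytes)

print(is_gzip(tgz_bytes))               # True: still gzipped
print(looks_like_plain_tar(tar_bytes))  # True: already un-gzipped
```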
> Download mirror for spark-2.2.1-bin-hadoop2.7.tgz has file with incorrect
> checksum
> ----------------------------------------------------------------------------------
>
> Key: SPARK-22851
> URL: https://issues.apache.org/jira/browse/SPARK-22851
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.2.1
> Reporter: John Brock
> Priority: Critical
>
> The correct sha512 is:
> 349ee4bc95c760259c1c28aaae0d9db4146115b03d710fe57685e0d18c9f9538d0b90d9c28f4031ed45f69def5bd217a5bf77fd50f685d93eb207445787f2685.
> However, the file I downloaded from
> http://apache.mirrors.pair.com/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
> is giving me a different sha512:
> 039935ef9c4813eca15b29e7ddf91706844a52287999e8c5780f4361b736eb454110825224ae1b58cac9d686785ae0944a1c29e0b345532762752abab9b2cba9
> It looks like this mirror has a file that isn't actually gzipped, just
> tarred. If I ungzip one of the copies of spark-2.2.1-bin-hadoop2.7.tgz with
> the correct sha512, and take the sha512 of the resulting tar, I get the same
> incorrect hash above of
> 039935ef9c4813eca15b29e7ddf91706844a52287999e8c5780f4361b736eb454110825224ae1b58cac9d686785ae0944a1c29e0b345532762752abab9b2cba9.
> I asked some colleagues to download the incorrect file themselves to check
> the hash -- some of them got a file that was gzipped and some didn't. I'm
> assuming there's some caching or mirroring happening that may give you a
> different file than the one I got.
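The check described in the report (hash the download, then hash the un-gzipped contents of a known-good copy) can be sketched as below. The function names and the three result labels are hypothetical, and the demo data is a small in-memory stand-in. For a file of roughly 200 MB you would probably want to hash in chunks instead of loading everything at once.

```python
import gzip
import hashlib


def sha512_hex(data):
    """Hex-encoded SHA-512 digest of a bytes object."""
    return hashlib.sha512(data).hexdigest()


def classify_download(downloaded, reference_tgz):
    """Hypothetical helper comparing a download against a known-good .tgz.

    reference_tgz is assumed to be a copy whose sha512 matches the
    published checksum.
    """
    actual = sha512_hex(downloaded)
    if actual == sha512_hex(reference_tgz):
        return "intact"
    # Same test as in the report: does the download hash like the
    # un-gzipped contents of the good copy?
    if actual == sha512_hex(gzip.decompress(reference_tgz)):
        return "gunzipped in transit"
    return "unknown mismatch"


# Stand-in data: a fake tar payload and its gzipped form.
fake_tar = b"stand-in for the tar contents" * 100
good_tgz = gzip.compress(fake_tar)

print(classify_download(good_tgz, good_tgz))  # intact
print(classify_download(fake_tar, good_tgz))  # gunzipped in transit
```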