[ https://issues.apache.org/jira/browse/SPARK-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-10949:
------------------------------
Affects Version/s: (was: 1.5.2)
(was: 1.6.0)
Target Version/s: (was: 1.5.0, 1.5.1, 1.5.2, 1.6.0)
[~aroberts] this can't affect versions like 1.5.2 / 1.6.0 since they don't
exist yet. Also, please don't set Target Version until it's clear this will go
in; see https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

I suspect this is a good idea, so go ahead with a PR to see what the tests say.
The worry is indeed just compatibility, since various Hadoop versions will
deploy a different snappy-java version. I imagine it will be a net win.
> Upgrade Snappy Java to 1.1.2
> ----------------------------
>
> Key: SPARK-10949
> URL: https://issues.apache.org/jira/browse/SPARK-10949
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 1.5.0, 1.5.1
> Reporter: Adam Roberts
> Priority: Minor
>
> Snappy now supports concatenation of serialized streams. This patch contains
> a version number change, and the "does not support" test is now a "supports"
> test.
> Note: I do have the pull request for this already created, tested to be OK on
> Intel 64 bit Linux, IBM Power 8 LE and Linux on IBM Z Systems (all with IBM
> Java 8). Also note that I was required to delete my m2 cache in order to
> resolve incompatible class exceptions when building.
> Snappy 1.1.2 changelog mentions:
> snappy-java-1.1.2 (22 September 2015)
>   * This is a backward compatible release for 1.1.x.
>   * Add AIX (32-bit) support.
>   * There is no upgrade for the native libraries of the other platforms.
>   * A major change since 1.1.1 is support for reading concatenated results of
>     SnappyOutputStream(s)
> snappy-java-1.1.2-RC2 (18 May 2015)
>   * Fix #107: SnappyOutputStream.close() is not idempotent
> snappy-java-1.1.2-RC1 (13 May 2015)
>   * SnappyInputStream now supports reading concatenated compressed results of
>     SnappyOutputStream
>   * There has been no compressed format change since 1.0.5.x, so you can read
>     the compressed results interchangeably between these versions.
>   * Fixes a problem when java.io.tmpdir does not exist.
> From https://github.com/xerial/snappy-java/blob/develop/Milestone.md and up
> to date at the time of this pull request
> Also note https://github.com/xerial/snappy-java/issues/103
> "@xerial not sure how feasible or likely it is for this to happen, but it'd
> help tremendously Spark's performance because we are experimenting with a new
> shuffle path that uses channel.transferTo to avoid user space copying.
> However, for that to work, we'd need the underlying format to support
> concatenation. As far as we know, LZF has this property, and Snappy might also
> have it (but snappy-java implementation doesn't support it)."
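
The reason concatenation support matters for the shuffle path quoted above can
be sketched in a few lines. This is an illustration only: it uses Python's
standard-library gzip module as a stand-in, because gzip is a format whose
decoder accepts concatenated streams. It does not use snappy-java or any Spark
API, and the helper names (compress_chunk, merge_without_decoding) and the
sample records are hypothetical.

```python
import gzip


def compress_chunk(records):
    """Compress one batch of records as an independent gzip stream."""
    return gzip.compress(b"".join(records))


def merge_without_decoding(compressed_chunks):
    """Merge already-compressed chunks by plain byte concatenation.

    No chunk is decompressed or recompressed here; this mirrors the idea of
    copying spill files together at the byte level.
    """
    return b"".join(compressed_chunks)


# Two independently compressed "spills" (hypothetical sample data).
spill_a = compress_chunk([b"k1=v1\n", b"k2=v2\n"])
spill_b = compress_chunk([b"k3=v3\n"])

# Byte-level merge, then a single decompression pass over the result.
merged = merge_without_decoding([spill_a, spill_b])
restored = gzip.decompress(merged)
```

With a decoder that stops after the first stream, the merge step would instead
have to decompress and recompress every chunk, which is the user-space copying
the quoted comment wants to avoid.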