GitHub user ksakellis opened a pull request:

    https://github.com/apache/spark/pull/3119

    [SPARK-4079] [CORE] Default to LZF if Snappy not available

    By default, Snappy is the compression codec. If Snappy is not 
available, Spark currently fails with an exception and a stack trace. With this 
change, Spark falls back to LZF and logs a warning message when Snappy is not 
available on the cluster.
    
    The only exception is when the user has explicitly set 
spark.io.compression.codec=snappy. In that case, if Snappy is not available, an 
IllegalArgumentException is thrown.
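
    The selection rule described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the code in this pull request: the function name choose_codec and the snappy_available flag are hypothetical, and ValueError stands in for the IllegalArgumentException mentioned above. Only the setting name spark.io.compression.codec and the codec names come from the description.

```python
import logging

log = logging.getLogger("codec_fallback_sketch")

DEFAULT_CODEC = "snappy"
FALLBACK_CODEC = "lzf"
CODEC_KEY = "spark.io.compression.codec"


def choose_codec(conf, snappy_available):
    """Pick a codec name following the fallback rule in the description.

    conf             -- dict of settings; the key may be absent (hypothetical model)
    snappy_available -- bool; assumed result of probing the Snappy library
    """
    requested = conf.get(CODEC_KEY)

    # Any explicitly requested codec other than Snappy is used as-is.
    if requested is not None and requested != DEFAULT_CODEC:
        return requested

    if snappy_available:
        return DEFAULT_CODEC

    # Snappy was explicitly requested but cannot be loaded: fail loudly.
    # ValueError stands in for the JVM's IllegalArgumentException.
    if requested == DEFAULT_CODEC:
        raise ValueError("snappy was requested but is not available")

    # Snappy was only the implicit default: warn and fall back to LZF.
    log.warning("Snappy is not available, falling back to %s", FALLBACK_CODEC)
    return FALLBACK_CODEC
```

    With no codec configured, choose_codec({}, False) returns "lzf" and logs a warning, while choose_codec({"spark.io.compression.codec": "snappy"}, False) raises.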
    
    Because the Snappy library uses static initialization, it was very 
difficult to simulate Snappy being unavailable in a unit test. The only way I 
could think of was to create multiple classloaders, which seemed excessive. As 
a result, most of this was tested ad hoc on a test cluster by setting the 
system property org.xerial.snappy.use.systemlib=true, which caused Snappy to 
fail to load and thus triggered this logic.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ksakellis/spark kostas-spark-4079

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3119.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3119
    
----
commit c8bc9db38f461ed4652ebf1ec70ef967b0d8f040
Author: Kostas Sakellis <[email protected]>
Date:   2014-11-05T02:26:12Z

    [SPARK-4079] [CORE] Default to LZF if Snappy not available
    
    By default, Snappy is the compression codec.
    If Snappy is not available, Spark currently fails
    with an exception and a stack trace. With this change,
    Spark falls back to LZF and logs a warning message
    when Snappy is not available on the cluster.
    
    The only exception is if the user has explicitly
    set spark.io.compression.codec=snappy. In this
    case, if snappy is not available, an
    IllegalArgumentException is thrown.
    
    Because the Snappy library uses static initialization,
    it was very difficult to simulate Snappy being
    unavailable in a unit test. The only way I could think
    of was to create multiple classloaders, which seemed
    excessive. As a result, most of this was tested ad hoc
    on a test cluster by setting the system property
    org.xerial.snappy.use.systemlib=true, which caused Snappy
    to fail to load and thus triggered this logic.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
