GitHub user frreiss opened a pull request:

    https://github.com/apache/spark/pull/15027

    [SPARK-17475] [STREAMING] Delete CRC files if the filesystem doesn't use checksum files

    ## What changes were proposed in this pull request?
    
    When the metadata logs for various parts of Structured Streaming are stored
    on non-HDFS filesystems such as NFS or ext4, the HDFSMetadataLog class leaves
    hidden HDFS-style checksum (CRC) files in the log directory, one file per
    batch. This PR modifies HDFSMetadataLog so that it detects when the target
    filesystem does not use CRC files and deletes the leftover CRC files.
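
For illustration only, here is a minimal Python sketch of the cleanup idea described above. It is not the Scala code in this PR. It assumes the Hadoop convention that the checksum for a file `name` is stored next to it as `.name.crc`; the helper names `crc_path_for` and `remove_crc_if_unused`, and the `fs_uses_checksums` flag, are hypothetical.

```python
from pathlib import Path
import tempfile


def crc_path_for(batch_file: Path) -> Path:
    """Return the sidecar checksum path, assumed to be '.<name>.crc'."""
    return batch_file.with_name(f".{batch_file.name}.crc")


def remove_crc_if_unused(batch_file: Path, fs_uses_checksums: bool) -> bool:
    """Delete the sidecar CRC file when the filesystem does not use one.

    Returns True if a CRC file was deleted, False otherwise.
    """
    crc = crc_path_for(batch_file)
    if not fs_uses_checksums and crc.exists():
        crc.unlink()
        return True
    return False


def demo() -> list:
    """Write three batch files with CRC sidecars, then clean up the sidecars."""
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = Path(tmp)
        for batch_id in range(3):
            batch = log_dir / str(batch_id)
            batch.write_text("metadata")
            # Simulate the hidden checksum file left behind per batch.
            crc_path_for(batch).write_bytes(b"crc")
            remove_crc_if_unused(batch, fs_uses_checksums=False)
        # Only the batch files themselves should remain.
        return sorted(p.name for p in log_dir.iterdir())
```

Calling `demo()` lists the log directory after cleanup; with `fs_uses_checksums=True` the sidecar files would be left in place.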
    
    ## How was this patch tested?
    
    Modified an existing test case in HDFSMetadataLogSuite to check whether
    HDFSMetadataLog correctly removes CRC files on the local POSIX filesystem.
    Ran the entire regression suite.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/frreiss/spark-fred fred-17475

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15027.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15027
    
----
commit 3a2a9c116659f526189de6b8b98fb6c92024a7a6
Author: frreiss <[email protected]>
Date:   2016-09-09T17:30:08Z

    Delete CRC files when the filesystem doesn't support checksums.

commit 9ff89c0228c09764fa6444528050a35e823db0e6
Author: frreiss <[email protected]>
Date:   2016-09-09T17:30:52Z

    Merge branch 'master' of https://github.com/apache/spark into fred-17475

----

