GitHub user zsxwing opened a pull request:

    https://github.com/apache/spark/pull/18799

    [SPARK-21596][SS]Ensure places calling HDFSMetadataLog.get check the return 
value

    ## What changes were proposed in this pull request?
    
    While investigating a flaky test, I realized that many places don't 
check the return value of `HDFSMetadataLog.get(batchId: Long): Option[T]`. When 
a batch is supposed to be there, the caller just ignores None rather than 
throwing an error. If a bug prevents a query from generating a batch metadata 
file, this behavior hides the bug, allows the query to continue running and 
eventually delete metadata logs, and makes the problem hard to debug.
    
    This PR ensures that places calling HDFSMetadataLog.get always check the 
return value.
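    
    As a rough illustration of the pattern (not the actual patch), the sketch 
below fails loudly when a batch that should exist is missing. Only the 
`get(batchId: Long): Option[T]` signature comes from the description above; 
`MetadataLog`, `InMemoryLog`, `MetadataLogCheck`, `getOrFail` and the error 
message are hypothetical names used for this example.

```scala
// Hypothetical stand-in for the metadata log. Only the `get` signature
// is taken from the PR description; everything else is an assumption.
trait MetadataLog[T] {
  def get(batchId: Long): Option[T]
}

// In-memory implementation used purely for illustration.
final class InMemoryLog[T](entries: Map[Long, T]) extends MetadataLog[T] {
  override def get(batchId: Long): Option[T] = entries.get(batchId)
}

object MetadataLogCheck {
  // Silent variant: `log.get(batchId).foreach(process)` would skip a
  // missing batch without any error, which is the behavior to avoid.

  // Checked variant: a missing batch raises an error immediately.
  def getOrFail[T](log: MetadataLog[T], batchId: Long): T =
    log.get(batchId).getOrElse {
      throw new IllegalStateException(s"batch $batchId doesn't exist")
    }
}
```

    With a helper like this, a caller that expects batch N to be present gets 
an exception at the point of the lookup instead of continuing with missing 
metadata.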
    
    ## How was this patch tested?
    
    Jenkins

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-21596

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18799.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18799
    
----
commit 7090fc064b9df22f6082f6de03b82a5bdfb29210
Author: Shixiong Zhu <[email protected]>
Date:   2017-08-01T18:26:02Z

    Ensure places calling HDFSMetadataLog.get check the return value

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
