[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

srowen Sun, 27 Apr 2014 07:13:51 -0700

Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/546#issuecomment-41497607
  
    getAbsolutePath() would also resolve both issues as far as I can tell,
    and does not involve I/O. IIRC there was a real issue observed in
    DiskBlockManagerSuite.
    
    For HttpBroadcast, see how the file names are retrieved with
    getAbsolutePath(), so this is a real issue albeit probably theoretical
    right now.  (That one's also resolved by just using a Set of Files.)
    
    (Is the I/O going to matter though? These are executed once.)
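
A minimal sketch of the point about path forms, not the actual Spark code: the class name, method names, and the file name `broadcast_0` are hypothetical, chosen only for illustration. It assumes files are tracked in a set and shows that a lookup can miss when one string form (`getAbsolutePath()`) is stored and another (`getPath()`) is checked, while a `Set` of `File` objects avoids choosing a string form at all.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; none of these names come from Spark.
class PathFormsSketch {

    // Stores the absolute path string but looks up the path as given.
    // For a relative File the two strings differ, so the lookup misses.
    static boolean trackedByString(File registered, File lookedUp) {
        Set<String> paths = new HashSet<>();
        paths.add(registered.getAbsolutePath());
        return paths.contains(lookedUp.getPath());
    }

    // Stores File objects directly; equality is on the abstract pathname,
    // so no string form has to be kept consistent by hand.
    static boolean trackedByFile(File registered, File lookedUp) {
        Set<File> files = new HashSet<>();
        files.add(registered);
        return files.contains(lookedUp);
    }

    public static void main(String[] args) throws IOException {
        File f = new File("broadcast_0"); // hypothetical relative file name

        System.out.println(trackedByString(f, f));
        System.out.println(trackedByFile(f, new File("broadcast_0")));

        // getAbsolutePath() only prefixes the working directory (no I/O).
        System.out.println(f.getAbsolutePath());
        // getCanonicalPath() may query the filesystem (for example to
        // resolve symlinks), which is why it declares IOException.
        System.out.println(f.getCanonicalPath());
    }
}
```

With a relative file, the string-based lookup misses because the stored and queried strings differ, whereas the `File`-based lookup matches. Two `File` objects that name the same file through different abstract pathnames (say, relative versus absolute) would still compare unequal, so this removes only the string-form mismatch.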
    
    On Sun, Apr 27, 2014 at 3:06 PM, Mridul Muralidharan
    <[email protected]> wrote:
    > Are you actually seeing problems, or is this a cleanup exercise to use
    > the appropriate API?
    > Creation of the file happens from within spark and is not externally
    > provided - and that should match how it gets used.
    >
    > If we are not seeing any actual issues, I would rather not go down the
    > path of fixing this.
    > getCanonicalPath, etc. are expensive operations requiring filesystem I/O
    > to resolve.



