[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-08-26 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/546 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabl

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-07-29 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-50538277 Since we merged in https://github.com/apache/spark/pull/749 I believe it's okay to close this, right @pwendell ? --- If your project is set up for it, you can reply to thi

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-05-16 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-42456863 Just my 2 bits, but using getCanonicalPath() is usually a code smell; unless it's explicitly necessary (i.e. you know you have a symlink and you want to remove the actual f

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-05-10 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-42454571 @mridulm @srowen sorry not totally following this one. Is there an immediate isolated bug here, or is this a broader issue that we need to add consistency across the code

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-05-10 Thread advancedxy
Github user advancedxy commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-42525924 Sorry for being late for this conversation. I am busy with my own project. First, I agree with @srowen. SPARK-1527 was discovered when there were untracked file

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-05-10 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-42484142 @pwendell @nsuthar First, note SPARK-1527 (https://issues.apache.org/jira/browse/SPARK-1527) and its PR (https://github.com/apache/spark/pull/436). @adva

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-05-03 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-42110226 It is not about a few uses here or there - either spark codebase as a whole moves to a) canonical path always; or always sticks to b) paths relative to cwd and/or what is

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41512227 Just fly-by-commenting here, but the performance cost of doing a few directory traversals is not relevant in this code path. It gets called only once in the lifetime of a

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-27 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41503775 Agree and that leads to the changes. For example `HttpBroadcast.scala` adds paths to the set using the string from `getAbsolutePath` (line 173) but removes them with the va

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-27 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41503498 It goes back to the problem we are trying to solve. If the set/map can contain arbitrary paths then file.getCanonical is unavoidable. But then (multiple) IO and nat

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-27 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41497607 getAbsolutePath() would also resolve both issues as far as I can tell, and does not involve I/O. IIRC there was a real issue observed in DiskBlockManagerSuite.

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-27 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41497440 Are you actually seeing problems or is this a cleanup exercise to use appropriate api ? Creation of the file happens from within spark and is not externally provided -

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-27 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/546#discussion_r12029270 --- Diff: core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala --- @@ -229,7 +229,7 @@ private[spark] object HttpBroadcast extends Logging {

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-25 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41447132 Jenkins, test this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-25 Thread nsuthar
Github user nsuthar commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41416816 could someone pls verify this patch. Thanks. Niraj --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-24 Thread nsuthar
Github user nsuthar commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41362685 fixed it to get abs pathjira 1623 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-24 Thread nsuthar
Github user nsuthar commented on a diff in the pull request: https://github.com/apache/spark/pull/546#discussion_r11984024 --- Diff: core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala --- @@ -229,7 +229,7 @@ private[spark] object HttpBroadcast extends Logging {

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-24 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/546#discussion_r11983954 --- Diff: core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala --- @@ -229,7 +229,7 @@ private[spark] object HttpBroadcast extends Logging {

[GitHub] spark pull request: SPARK-1623. Broadcast cleaner should use getCa...

2014-04-24 Thread nsuthar
Github user nsuthar commented on the pull request: https://github.com/apache/spark/pull/546#issuecomment-41360736 OK, Sorry I misunderstood it. did it. On Thu, Apr 24, 2014 at 10:44 PM, Sean Owen wrote: > @nsuthar pretty sure he meant