Github user ehiggs closed the pull request at:
https://github.com/apache/spark/pull/4204
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user ehiggs commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-119210146
Closed as it's not the correct approach.
Thanks @srowen
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-96769755
Can one of the admins verify this patch?
Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/4204#discussion_r23694090
--- Diff:
core/src/main/scala/org/apache/spark/storage/LocalFileSystem.scala ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user ehiggs commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-71740260
Thanks for your feedback.
So the `FileInputFormat` is responsible for sorting the file pieces. I
think this means any file format that one expects `sortByKey` to
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-71741379
Yes, or override `listStatus` I think. I suppose that also has the problem
of not being universal. I wonder how often it's necessary to ensure that
partitions are
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-71440194
Ah right it's only `Comparable`. Hm, but this requires users to set a new
option to get the desired behavior, and doesn't 'fix' behavior for any other
`FileSystem`. You
Github user ehiggs commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-71438376
The recommendation on the mailing list was to provide a `FileSystem` that
could be used from the config `spark.hadoop.fs.file.impl`.
`Path` doesn't have an
GitHub user ehiggs opened a pull request:
https://github.com/apache/spark/pull/4204
SPARK-5300 Add LocalFileSystem which will return file parts in the corre...
...ct order.
We override listLocatedStatus and slurp up the iterator from the Hadoop
LocalFileSystem version,
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-71432775
Can one of the admins verify this patch?
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-71432801
This isn't used anywhere. Why not sort paths lexicographically in the
calling code rather than make a new `FileSystem`?
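A minimal sketch of what sorting in the calling code might look like, assuming the listing is available as plain path strings. The object name, helper name, and sample paths are hypothetical and not taken from the pull request; the real calling code in Spark or Hadoop may differ.

```scala
// Hypothetical illustration only: none of these names come from the pull request.
object SortParts {
  // Sort path strings lexicographically (String's natural ordering).
  // This matches the intended part order only when the numeric suffix is
  // zero-padded, e.g. part-00002 sorts before part-00010.
  def sortLexicographically(paths: Seq[String]): Seq[String] = paths.sorted

  def main(args: Array[String]): Unit = {
    // Part files in the arbitrary order a directory listing may return them.
    val listed = Seq("out/part-00002", "out/part-00000", "out/part-00001")
    sortLexicographically(listed).foreach(println)
  }
}
```

With unpadded names such as `part-2` and `part-10`, a lexicographic sort would put `part-10` first, so this approach depends on the zero-padded naming convention.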
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4204#discussion_r23518701
--- Diff:
core/src/main/scala/org/apache/spark/storage/LocalFileSystem.scala ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user ehiggs commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-71448742
Well, I could submit a patch to hadoop-common to do the sort in
`o.a.h.fs.FileSystem`, and that would fix it basically everywhere. However, I
had assumed that hadoop
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/4204#issuecomment-71449808
That's not what I mean, since it still wouldn't address any custom or
third-party `FileSystem` implementations outside Hadoop. I'm talking about
something much simpler: