Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15539
Just submitted a PR for `set location`:
https://github.com/apache/spark/pull/16514 That issue is caused by the cache
for mapping the table name to LogicalRelation. We need to refresh it after
Github user ericl commented on the issue:
https://github.com/apache/spark/pull/15539
Hmm, I don't think fileStatusCache can ever return incorrect results, only
stale ones. Furthermore, it's scoped by client-id to particular instances of
tables, so refresh table is guaranteed to wipe
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15539
It could return incorrect results, but I need to prove it using a use case.
We always call
Github user ericl commented on the issue:
https://github.com/apache/spark/pull/15539
Hm, what use cases are we trying to address? As I understand, the worst
that can happen if the cache size flag is toggled at runtime is that the old
settings might still apply. And when the
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15539
Yeah, I think we should document the behavior issues when different
sessions are using different conf values. Will do it. I think we also need to
evict all the cache entries that are associated with the
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15539
`(ClientId, Path), Array[FileStatus]` uh... `FileStatusCache` does not
share any entries with any other client, but does share memory resources for
the purpose of cache eviction.
Sorry,
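
The keying scheme described above (entries keyed by `(ClientId, Path)`, never shared across clients, but drawing on one shared memory budget for eviction) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's `FileStatusCache`: the class name `SharedFileStatusCache`, the entry-count limit `max_entries`, the LRU policy, and the use of plain strings in place of `FileStatus` objects are all hypothetical choices made for illustration.

```python
from collections import OrderedDict
from typing import Hashable, List, Optional, Tuple

Key = Tuple[Hashable, str]  # (client_id, path); hypothetical stand-in for (ClientId, Path)


class SharedFileStatusCache:
    """Toy LRU cache: entries are private per client, capacity is shared."""

    def __init__(self, max_entries: int) -> None:
        # Assumption: capacity is counted in entries, not bytes.
        self._max_entries = max_entries
        self._entries: "OrderedDict[Key, List[str]]" = OrderedDict()

    def put(self, client_id: Hashable, path: str, statuses: List[str]) -> None:
        key = (client_id, path)
        self._entries[key] = statuses
        self._entries.move_to_end(key)  # mark as most recently used
        # Eviction is global: the least recently used entry is dropped,
        # regardless of which client owns it.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def get(self, client_id: Hashable, path: str) -> Optional[List[str]]:
        key = (client_id, path)
        if key not in self._entries:
            return None  # a client never sees another client's entries
        self._entries.move_to_end(key)
        return self._entries[key]

    def invalidate_client(self, client_id: Hashable) -> None:
        # Models a per-table refresh: drop every entry owned by this client.
        for key in [k for k in self._entries if k[0] == client_id]:
            del self._entries[key]
```

In this model, two clients caching the same path hold separate entries, yet a burst of inserts from one client can evict the other's entries because they compete for the same capacity. Invalidating one client leaves the other's entries untouched.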
Github user ericl commented on the issue:
https://github.com/apache/spark/pull/15539
That one is safe to make global but mutable, right? It will take effect
after a table is refreshed.
Most of these anomalies seem OK to me provided we document them -- it seems
to solve
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15539
@ericl Great work on this. I don't know how I got an author credit in the
commit...
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/15539
merging to master!
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/15539
I had some network problems, so I'll ask @yhuai to merge it.
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/15539
LGTM, next maybe we can refactor the `PartitionAwareFileCatalog` and make
it use the new global cache better. I'm going to merge it to unblock other
work. Thanks!
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67346/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67346 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67346/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67346 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67346/consoleFull)**
for PR 15539 at commit
Github user ericl commented on the issue:
https://github.com/apache/spark/pull/15539
> The biggest problem of this proposal is, invalidating the cache may be
slow if there are a lot of cache entries.
I don't think this is really an issue. Conservatively assuming ~1us per
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/15539
We have REFRESH TABLE/PATH because we cache things, so I think we should
consider caching and refreshing together. Currently we have 4 caches:
1. **table name to `LogicalRelation`
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67326/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67326 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67326/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67326 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67326/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67292/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67292 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67292/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67291/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67291 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67291/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67292 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67292/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67281/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67281 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67281/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67291 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67291/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67281 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67281/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67222/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67222 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67222/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67217/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67217 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67217/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67215/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67215 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67215/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67222 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67222/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67217 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67217/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67215 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67215/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67211/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67211 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67211/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67211 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67211/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67210/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67210 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67210/consoleFull)**
for PR 15539 at commit
Github user ericl commented on the issue:
https://github.com/apache/spark/pull/15539
@mallman those numbers seem about right. I think as long as planning time
is not that much worse than with the old ListingFileCatalog we are good.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67210 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67210/consoleFull)**
for PR 15539 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67160/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67160 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67160/consoleFull)**
for PR 15539 at commit
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15539
@ericl I took this PR for a test drive with some large-ish tables.
Everything appeared to work as expected.
As far as performance goes, planning a simple select on a partitioned table
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67152/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15539
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67152 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67152/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67160 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67160/consoleFull)**
for PR 15539 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15539
**[Test build #67152 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67152/consoleFull)**
for PR 15539 at commit