[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-09 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15539
  
Just submitted a PR for `set location`: 
https://github.com/apache/spark/pull/16514 That issue is caused by the cache 
for mapping the table name to LogicalRelation. We need to refresh it after 
changing the table location.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-09 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/15539
  
Hmm, I don't think fileStatusCache can ever return incorrect results, only 
stale ones. Furthermore, its scoped by client-id to particular instances of 
tables, so refresh table is guaranteed to wipe out cache state.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-08 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15539
  
It could return incorrect results, but I need to prove it using a use case. 
We always call 
[refresh0](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala#L50)
 in 
[`filterPartitions`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CatalogFileIndex.scala#L78-L79).
 Thus, it covers most cases. Let me try more in the next few days. 

Tonight, let me first fix the cache inconsistence issue caused by `SET 
LOCATION`. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-08 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/15539
  
Hm, what use cases are we trying to address? As I understand, the worst 
that can happen if the cache size flag is toggled at runtime is that the old 
settings might still apply. And when the filesource partition management flag 
is changed, it only takes effect after the table entry in 
`cachedDataSourceTables` in HiveMetastoreCatalog is refreshed.

If we document the above, is that enough?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-08 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15539
  
Yeah, I think we should document the behavior issues when different 
sessions are using different conf values. Will do it. I think we also need to 
evict all the cache that are associated with the specific client when users 
make a switch. Do you think this makes sense?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-08 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15539
  
`(ClientId, Path), Array[FileStatus]` uh... `FileStatusCache` does not 
share any entries with any other client, but does share memory resources for 
the purpose of cache eviction.

Sorry, I misunderstood the code. I think we need to evict all the cache 
that are associated with the specific client when users make a switch. Do you 
think this makes sense?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2017-01-08 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/15539
  
That one is safe to make global but mutable right? It will take effect
after a table is refreshed.

Most of these anomalies seem OK to me provided we document them -- it seems
to solve them we would have to cache per session which would significantly
increase memory required.

On Sun, Jan 8, 2017, 9:32 AM Xiao Li  wrote:

> *@gatorsmile* commented on this pull request.
> --
>
> In
> 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileStatusCache.scala
> :
>
> > +   * Invalidates all data held by this cache.
> +   */
> +  def invalidateAll(): Unit
> +}
> +
> +object FileStatusCache {
> +  private var sharedCache: SharedInMemoryCache = null
> +
> +  /**
> +   * @return a new FileStatusCache based on session configuration. Cache 
memory quota is
> +   * shared across all clients.
> +   */
> +  def newCache(session: SparkSession): FileStatusCache = {
> +synchronized {
> +  if (session.sqlContext.conf.filesourcePartitionPruning &&
> +  session.sqlContext.conf.filesourcePartitionFileCacheSize > 0) {
>
> How about HIVE_MANAGE_FILESOURCE_PARTITIONS?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> , or mute the thread
> 

> .
>



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-24 Thread mallman
Github user mallman commented on the issue:

https://github.com/apache/spark/pull/15539
  
@ericl Great work on this. I don't know how I got an author credit in the 
commit...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-22 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/15539
  
merging to master!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/15539
  
I had some network problems, I'll ask @yhuai to merge it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/15539
  
LGTM, next maybe we can refactor the `PartitionAwareFileCatalog` and make 
it use the new global cache better. I'm going to merge it to unblock other 
works. thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67346/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67346 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67346/consoleFull)**
 for PR 15539 at commit 
[`2a96537`](https://github.com/apache/spark/commit/2a965377258d4d77db5a3f00d4257bbacc4a0adb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67346 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67346/consoleFull)**
 for PR 15539 at commit 
[`2a96537`](https://github.com/apache/spark/commit/2a965377258d4d77db5a3f00d4257bbacc4a0adb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/15539
  
> The biggest problem of this proposal is, invalidating the cache may be 
slow if there are a lot of cache entries.

I don't think this is really an issue. Conservatively assuming ~1us per 
eviction and one file per partition, we can evict all 250MB of file entries in 
~160ms.

I also cleaned up the code a little to better hide the ids from cache users.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/15539
  
We have REFRESH TABLE/PATH because we cache things, so I think we should 
consider caching and refreshing together. Currently we have 4 caches:

1. **table name to `LogicalRelation` cache**: It's kept in 
`HiveMetastoreCatalog`, we won't touch it in this PR.
2. **table name to partition directories**: It's kept in 
`ListingFileCatalog` and is data source table only, as hive table persists this 
information in metastore. #15515 is going to follow hive and remove this cache 
for data source table.
3. **partition directory to file status**: It's also kept in 
`ListingFileCatalog` and is data source table only. This PR is going to add it 
to data source tables that are converted from hive tables, and make it global.
4. **table relation to `InMemoryRelation`**: It's kept in `CacheManager`, 
and will only be cached when users call `CACHE TABLE`. We won't touch it in 
this PR.

The current refresh logic:

1. **REFRESH TABLE**: it invalidates all 4 caches, by removing the first 
cache, as `FileCatalog` is a member of `HadoopFsRelation` and 
`HadoopFsRelation` is a member of `LogicalRelation`. Then it asks `CacheManage` 
to invalidate the `InMemoryRelation` cache if it's cached.
2. **REFRESH PATH**: it finds a table that uses the given path as table 
path or partition paths, and then refresh this table. Actually I don't think 
this is the right behaviour, REFRESH PATH should be more fine-grained, and only 
remove this path from the third cache.

However, if we make the third cache global, we need a little more work to 
make REFRESH TABLE work, because just removing the first cache is not enough. 
We need a way to know which partition directories are cached for the given 
table.

Proposal 1(the current implementation of this PR): In the global "partition 
directory to file status" cache, we use `clientId -> partition dir` as key 
instead of `partition dir`. When we refresh a table, we go through the whole 
map and find cache entries match the `clientId` and remove them. Note that, 
every `TableFileCatalog` has a unique `clientId`.

The biggest problem of this proposal is, invalidating the cache may be slow 
if there are a lot of cache entries.

Proposal 2: we just use `partition dir` as the cache entry key in the 
global cache, and keep the list of cached partition dirs in `TableFileCatalog`. 
When refreshing a table, we get the list of cached partition dir first, and 
then invalidete them one by one in the "partition directory to file status" 
cache.

The biggest problem of this proposal is, if 2 tables share some paths(it's 
possible if we use same paths to add parittions to 2 tables), refreshing one 
table will partially refresh the other table. What is worse, if the other table 
is explicitly cached via CACHE TABLE, we won't invalidate the 
`InMemoryRelation` cache and brings inconsistency.

Proposal 3: do not make the "partition directory to file status" cache 
global in this PR, and think about it later. At least it's not worse than spark 
2.0

Any ideas?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67326/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67326 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67326/consoleFull)**
 for PR 15539 at commit 
[`262f6ee`](https://github.com/apache/spark/commit/262f6eed6de2a290c7b2ce6b0efefb05d0754629).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67326 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67326/consoleFull)**
 for PR 15539 at commit 
[`262f6ee`](https://github.com/apache/spark/commit/262f6eed6de2a290c7b2ce6b0efefb05d0754629).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67292/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67292 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67292/consoleFull)**
 for PR 15539 at commit 
[`57452a9`](https://github.com/apache/spark/commit/57452a9e0049157b2d80876e6fb668ffd65a6966).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67291/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67291 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67291/consoleFull)**
 for PR 15539 at commit 
[`9ada9b5`](https://github.com/apache/spark/commit/9ada9b50790ad808e208c6dc64b5f00a9990611a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67292 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67292/consoleFull)**
 for PR 15539 at commit 
[`57452a9`](https://github.com/apache/spark/commit/57452a9e0049157b2d80876e6fb668ffd65a6966).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67281/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67281 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67281/consoleFull)**
 for PR 15539 at commit 
[`4647f1f`](https://github.com/apache/spark/commit/4647f1fd5c2283ed543f07d212478965bfa5b88d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67291 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67291/consoleFull)**
 for PR 15539 at commit 
[`9ada9b5`](https://github.com/apache/spark/commit/9ada9b50790ad808e208c6dc64b5f00a9990611a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67281 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67281/consoleFull)**
 for PR 15539 at commit 
[`4647f1f`](https://github.com/apache/spark/commit/4647f1fd5c2283ed543f07d212478965bfa5b88d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67222/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67222 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67222/consoleFull)**
 for PR 15539 at commit 
[`35b565b`](https://github.com/apache/spark/commit/35b565bf7aa114068fc11fd50a0c51574be0cbe8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67217/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67217 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67217/consoleFull)**
 for PR 15539 at commit 
[`0b54b4c`](https://github.com/apache/spark/commit/0b54b4ca61309779e4de661216d15bbbe08e76be).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class InMemoryCache(maxSizeInBytes: Long) extends FileStatusCache with 
Logging `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67215/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67215 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67215/consoleFull)**
 for PR 15539 at commit 
[`f318aa7`](https://github.com/apache/spark/commit/f318aa73f85d8141863a3138a0b7c21e56ffa1e5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67222 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67222/consoleFull)**
 for PR 15539 at commit 
[`35b565b`](https://github.com/apache/spark/commit/35b565bf7aa114068fc11fd50a0c51574be0cbe8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67217 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67217/consoleFull)**
 for PR 15539 at commit 
[`0b54b4c`](https://github.com/apache/spark/commit/0b54b4ca61309779e4de661216d15bbbe08e76be).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67215 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67215/consoleFull)**
 for PR 15539 at commit 
[`f318aa7`](https://github.com/apache/spark/commit/f318aa73f85d8141863a3138a0b7c21e56ffa1e5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67211/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67211 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67211/consoleFull)**
 for PR 15539 at commit 
[`f63ce3c`](https://github.com/apache/spark/commit/f63ce3cc13576a23e115b68a5caff4436554f765).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67211 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67211/consoleFull)**
 for PR 15539 at commit 
[`f63ce3c`](https://github.com/apache/spark/commit/f63ce3cc13576a23e115b68a5caff4436554f765).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67210/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67210 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67210/consoleFull)**
 for PR 15539 at commit 
[`3fba74c`](https://github.com/apache/spark/commit/3fba74cf0413ae9cc3d2117d49cd14bb42edd549).
 * This patch **fails Scala style tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/15539
  
@mallman those numbers seem about right. I think as long as planning time 
is not that much worse than with the old ListingFileCatalog we are good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67210 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67210/consoleFull)**
 for PR 15539 at commit 
[`3fba74c`](https://github.com/apache/spark/commit/3fba74cf0413ae9cc3d2117d49cd14bb42edd549).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67160/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67160 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67160/consoleFull)**
 for PR 15539 at commit 
[`b9272c2`](https://github.com/apache/spark/commit/b9272c2a51ab981a6343a43931f57d03e931aab1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread mallman
Github user mallman commented on the issue:

https://github.com/apache/spark/pull/15539
  
@ericl I took this PR for a test drive with some large-ish tables. 
Everything appeared to work as expected.

As far as performance goes, planning a simple select on a partitioned table 
(with cached partition metadata) took less than 1 second up to about 3k 
partitions. It was about 1.5 seconds with 5k partitions, 2 seconds with about 
7.5k partitions and about 3.2 seconds with 10k partitions. Does that align with 
your findings?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67152/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15539
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67152 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67152/consoleFull)**
 for PR 15539 at commit 
[`947c32d`](https://github.com/apache/spark/commit/947c32d1658948ed4322ab1c3413e92648432ab9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `abstract class FileStatusCache `
  * `class InMemoryCache extends FileStatusCache `
  * `class NoopCache extends FileStatusCache `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67160 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67160/consoleFull)**
 for PR 15539 at commit 
[`b9272c2`](https://github.com/apache/spark/commit/b9272c2a51ab981a6343a43931f57d03e931aab1).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...

2016-10-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15539
  
**[Test build #67152 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67152/consoleFull)**
 for PR 15539 at commit 
[`947c32d`](https://github.com/apache/spark/commit/947c32d1658948ed4322ab1c3413e92648432ab9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org