[jira] [Comment Edited] (HADOOP-13649) s3guard: implement time-based (TTL) expiry for LocalMetadataStore

2018-05-07 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466798#comment-16466798
 ] 

Aaron Fabbri edited comment on HADOOP-13649 at 5/8/18 3:07 AM:
---

Nice work [~gabor.bota], this looks good. I hope you don't mind, I've attached 
a v3 of the patch with a couple tweaks:

- Fix the checkstyle issue.
- TTL is optional (if ttl config is zero, it is disabled)
- TTL units seconds -> milliseconds (in case anyone wants to play with very 
short TTLs, which i think are interesting)
- Minor test changes to make sure they still work with zero TTL
- Added "Evolving" API annotations to the "undocumented" configs used for 
LocalMetadataStore.

Please review and give a "+1 (nonbinding)" if you like these changes.  I tested 
in US West 2.


was (Author: fabbri):
Nice work [~gabor.bota], this looks good. I hope you don't mind, I've attached 
a v3 of the patch with a couple tweaks:

- Fix the checkstyle issue.
- TTL is optional (if ttl config is zero, it is disabled)
- TTL units seconds -> milliseconds (in case anyone wants to play with very 
short TTLs, which i think are interesting)
- Minor test changes to make sure they still work with zero TTL

Please review and give a "+1 (nonbinding)" if you like these changes.  I tested 
in US West 2.

> s3guard: implement time-based (TTL) expiry for LocalMetadataStore
> -
>
> Key: HADOOP-13649
> URL: https://issues.apache.org/jira/browse/HADOOP-13649
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Aaron Fabbri
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-13649.001.patch, HADOOP-13649.002.patch, 
> HADOOP-13649.003.patch
>
>
> LocalMetadataStore is primarily a reference implementation for testing.  It 
> may be useful in narrow circumstances where the workload can tolerate 
> short-term lack of inter-node consistency:  Being in-memory, one JVM/node's 
> LocalMetadataStore will not see another node's changes to the underlying 
> filesystem.
> To put a bound on the time during which this inconsistency may occur, we 
> should implement time-based (a.k.a. Time To Live / TTL)  expiration for 
> LocalMetadataStore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13649) s3guard: implement time-based (TTL) expiry for LocalMetadataStore

2018-05-07 Thread Gabor Bota (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465802#comment-16465802
 ] 

Gabor Bota edited comment on HADOOP-13649 at 5/7/18 11:23 AM:
--

Mvn test and verify were successful on eu-west-1 with 
fs.s3a.s3guard.test.enabled (_-Ds3guard)._


was (Author: gabor.bota):
Mvn test and verify were successful on eu-west-1 with 
fs.s3a.s3guard.test.enabled _-Ds3guard._

> s3guard: implement time-based (TTL) expiry for LocalMetadataStore
> -
>
> Key: HADOOP-13649
> URL: https://issues.apache.org/jira/browse/HADOOP-13649
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Aaron Fabbri
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-13649.001.patch, HADOOP-13649.002.patch
>
>
> LocalMetadataStore is primarily a reference implementation for testing.  It 
> may be useful in narrow circumstances where the workload can tolerate 
> short-term lack of inter-node consistency:  Being in-memory, one JVM/node's 
> LocalMetadataStore will not see another node's changes to the underlying 
> filesystem.
> To put a bound on the time during which this inconsistency may occur, we 
> should implement time-based (a.k.a. Time To Live / TTL)  expiration for 
> LocalMetadataStore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13649) s3guard: implement time-based (TTL) expiry for LocalMetadataStore

2018-05-07 Thread Gabor Bota (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465636#comment-16465636
 ] 

Gabor Bota edited comment on HADOOP-13649 at 5/7/18 8:53 AM:
-

Thanks for the review.
 # I've created HADOOP-15423 to merge the two caches into one.
 # .expireAfterWrite() vs .expireAfterAccess()
 ** I think that access could be better in this situation, as long as there's no
modification in the underlying bucket from another client - so no one else is 
modifying the s3 bucket like deleting files while the cache is in use - that 
way we can
say that the cache is up to date.
 ** This store is only used for testing right now, so I can say that's right to 
choose expireAfterAccess.
 # Locking
 ** The com.google.common.cache.LocalCache has locking for write (e.g put, 
replace, remove) but not for simple read (getIfPresent).
 ** LocalMetadataStore has a lock for read too: synchronized (this) in get().
 ** As the merge of the two caches will happen in HADOOP-15423, I think that's 
a topic to discuss further on that issue.
 # Performance testing
 ** I've done some performance testing to compare the cache vs hash performance.
 ** I hope that used sane parameters during the tests.
 ** Based on this, there will be some performance decrease with this 
implementation, but nothing significant with the basic test settings - in my 
tests I've modified (increased) the settings a little. Move() performance 
should improve when merging the caches - it will be interesting to compare 
what's happening after that change.
 ** Test results are in the following gist: 
[https://gist.github.com/bgaborg/2220fd53e553ec971c8edd1adf2493cd] 


was (Author: gabor.bota):
Thanks for the review.
 # I've created HADOOP-15423 to merge the two caches into one.
 # .expireAfterWrite() vs .expireAfterAccess()
 ** I think that access could be better in this situation, as long as there's no
modification in the underlying bucket from another client - so no one else is 
modifying the s3 bucket like deleting files while the cache is in use - that 
way we can
say that the cache is up to date.
 ** This store is only used for testing right now, so I can say that's right to 
choose expireAfterAccess.
 # Locking
 ** The com.google.common.cache.LocalCache has locking for write (e.g put, 
replace, remove) but not for simple read (getIfPresent).
 ** LocalMetadataStore has a lock for read too: synchronized (this) in get().
 ** As the merge of the two caches will happen in HADOOP-15423, I think that's 
a topic to discuss further on that issue.
 # Performance testing
 ** I've done some performance testing to compare the cache vs hash performance.
 ** I hope that used sane parameters during the tests.
 ** Based on this, there will be some performance decrease with this 
implementation, but nothing significant with the basic test settings - in my 
tests I've modified the settings a little bit. Move() performance should 
improve when merging the caches - it will be interesting to compare what's 
happening after that change.
 ** Test results are in the following gist: 
[https://gist.github.com/bgaborg/2220fd53e553ec971c8edd1adf2493cd] 

> s3guard: implement time-based (TTL) expiry for LocalMetadataStore
> -
>
> Key: HADOOP-13649
> URL: https://issues.apache.org/jira/browse/HADOOP-13649
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Aaron Fabbri
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-13649.001.patch, HADOOP-13649.002.patch
>
>
> LocalMetadataStore is primarily a reference implementation for testing.  It 
> may be useful in narrow circumstances where the workload can tolerate 
> short-term lack of inter-node consistency:  Being in-memory, one JVM/node's 
> LocalMetadataStore will not see another node's changes to the underlying 
> filesystem.
> To put a bound on the time during which this inconsistency may occur, we 
> should implement time-based (a.k.a. Time To Live / TTL)  expiration for 
> LocalMetadataStore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13649) s3guard: implement time-based (TTL) expiry for LocalMetadataStore

2018-05-07 Thread Gabor Bota (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465636#comment-16465636
 ] 

Gabor Bota edited comment on HADOOP-13649 at 5/7/18 8:43 AM:
-

Thanks for the review.
 # I've created HADOOP-15423 to merge the two caches into one.
 # .expireAfterWrite() vs .expireAfterAccess()
 ** I think that access could be better in this situation, as long as there's no
modification in the underlying bucket from another client - so no one else is 
modifying the s3 bucket like deleting files while the cache is in use - that 
way we can
say that the cache is up to date.
 ** This store is only used for testing right now, so I can say that's right to 
choose expireAfterAccess.
 # Locking
 ** The com.google.common.cache.LocalCache has locking for write (e.g put, 
replace, remove) but not for simple read (getIfPresent).
 ** LocalMetadataStore has a lock for read too: synchronized (this) in get().
 ** As the merge of the two caches will happen in HADOOP-15423, I think that's 
a topic to discuss further on that issue.
 # Performance testing
 ** I've done some performance testing to compare the cache vs hash performance.
 ** I hope that used sane parameters during the tests.
 ** Based on this, there will be some performance decrease with this 
implementation, but nothing significant with the basic test settings - in my 
tests I've modified the settings a little bit. Move() performance should 
improve when merging the caches - it will be interesting to compare what's 
happening after that change.
 ** Test results are in the following gist: 
[https://gist.github.com/bgaborg/2220fd53e553ec971c8edd1adf2493cd] 


was (Author: gabor.bota):
Thanks for the review.
 # I've created HADOOP-15423 to merge the two caches into one.
 # .expireAfterWrite() vs .expireAfterAccess()
 ** I think that access could be better in this situation, as long as there's no
modification in the underlying bucket from another client - so no one else is 
modifying the s3
bucket like deleting files while the cache is in use - that way we can
say that the cache is up to date.
This store is only used for testing right now, so I can say that's right to 
choose expireAfterAccess.
 # Locking
 ** The com.google.common.cache.LocalCache has locking for write (e.g put, 
replace, remove) but not for simple read (getIfPresent).
 ** LocalMetadataStore has a lock for read too: synchronized (this) in get().
 ** As the merge of the two caches will happen in HADOOP-15423, I think that's 
a topic to discuss further on that issue.
 # Performance testing
 ** I've done some performance testing to compare the cache vs hash performance.
 ** I hope that used sane parameters during the tests.
 ** Based on this, there will be some performance decrease with this 
implementation, but nothing significant with the basic test settings - in my 
tests I've modified the settings a little bit. Move() performance should 
improve when merging the caches - it will be interesting to compare what's 
happening after that change.
 ** Test results are in the following gist: 
[https://gist.github.com/bgaborg/2220fd53e553ec971c8edd1adf2493cd] 

> s3guard: implement time-based (TTL) expiry for LocalMetadataStore
> -
>
> Key: HADOOP-13649
> URL: https://issues.apache.org/jira/browse/HADOOP-13649
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Aaron Fabbri
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-13649.001.patch, HADOOP-13649.002.patch
>
>
> LocalMetadataStore is primarily a reference implementation for testing.  It 
> may be useful in narrow circumstances where the workload can tolerate 
> short-term lack of inter-node consistency:  Being in-memory, one JVM/node's 
> LocalMetadataStore will not see another node's changes to the underlying 
> filesystem.
> To put a bound on the time during which this inconsistency may occur, we 
> should implement time-based (a.k.a. Time To Live / TTL)  expiration for 
> LocalMetadataStore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13649) s3guard: implement time-based (TTL) expiry for LocalMetadataStore

2018-04-25 Thread Gabor Bota (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452447#comment-16452447
 ] 

Gabor Bota edited comment on HADOOP-13649 at 4/25/18 3:17 PM:
--

Comments on my 001 patch:
1. Removed the LruHasMap implementation
 - Using the com.google.common.cache.Cache
 - All tests pass
 - Loosing the mruGet method. I was not able to find any test for this 
particular feature, or 'real world' usage besides that it will modify the 
LinkedList inside the LinkedHashMap, so it will have the most recently modified 
elements in the start of the List. I don't think that would cause big 
performance issue in testing, but I've made a little benchmark to test this.[1]
 - Using basically the same implementation in LocalMetadataStore, the new 
addition is only the Timed Eviction.
 - Of course I had to rename every hash to cache, and there were some other 
minor changes.

2. Created a test for the Timed Eviction (testCacheTimedEvictionAfterWrite)
 - Only in the TestLocalMetadataStore (not in the abstract test)
 - Just tests if the cache implementation inside LocalMetadataStore working as 
expected, nothing more.
 - Maybe additional tests could be added for higher level functionality, but 
the problem is that there will be a need from the test to change the default 
cache inside the instance to another cache build with the support for a custom 
Ticker to avoid realtime waiting for the eviction (which could cause flakyness).

3. Should the removal of the evicted elements be logged?
 - There is an option to log removed elements. It can be done synchronously 
(default), and asynchronously as well. Using the synchronous way could be 
expensive, and I'm not even sure if we want this (maybe good for debug) so I 
haven't included it in the initial patch.

4. Maybe DEFAULT_EXPIRY_AFTER_WRITE_SECONDS should be 
DEFAULT_EVICTION_AFTER_WRITE_SECONDS

[1] The impact of having cache instead of LruHashMap in LocalMetadataStore on 
tests:
 - mvn -Dparallel-tests -DtestsThreadCount=8 clean verify
with patch: 4m10.917s
without patch: 4m12.413s
 - mvn -Dparallel-tests -DtestsThreadCount=8 clean test
with patch: 4m6.383s
without patch: 4m5.217s

Test and verify runs on us-west-2 succesfully for the patch.


was (Author: gabor.bota):
Comments on my 001 patch:
1. Removed the LruHasMap implementation
 - Using the com.google.common.cache.Cache
 - All tests pass
 - Loosing the mruGet method. I was not able to find any test for this 
particular feature,
or 'real world' usage besides that it will modify the LinkedList inside
the LinkedHashMap, so it will have the most recently modified elements
in the start of the List. I don't think that would cause big performance
issue in testing, but I've made a little benchmark to test this.[1]
 - Using basically the same implementation in LocalMetadataStore, the
new addition is only the Timed Eviction.
 - Of course I had to rename every hash to cache, and there were some other 
minor changes.

2. Created a test for the Timed Eviction (testCacheTimedEvictionAfterWrite)
 - Only in the TestLocalMetadataStore (not in the abstract test)
 - Just tests if the cache implementation inside LocalMetadataStore working
as expected, nothing more.
 - Maybe additional tests could be added for higher level functionality, but the
problem is that there will be a need from the test to change the default
cache inside the instance to another cache build with the support for a custom 
Ticker
to avoid realtime waiting for the eviction (which could cause flakyness).

3. Should the removal of the evicted elements be logged?
 - There is an option to log removed elements. It can be done synchronously 
(default),
and asynchronously as well. Using the synchronous way could be expensive, and 
I'm
not even sure if we want this (maybe good for debug) so I haven't included it 
in the
initial patch.

4. Maybe DEFAULT_EXPIRY_AFTER_WRITE_SECONDS should be 
DEFAULT_EVICTION_AFTER_WRITE_SECONDS

[1] The impact of having cache instead of LruHashMap in LocalMetadataStore on 
tests:
 - mvn -Dparallel-tests -DtestsThreadCount=8 clean verify
with patch: 4m10.917s
without patch: 4m12.413s
 - mvn -Dparallel-tests -DtestsThreadCount=8 clean test
with patch: 4m6.383s
without patch: 4m5.217s

Test and verify runs on us-west-2 succesfully for the patch.

> s3guard: implement time-based (TTL) expiry for LocalMetadataStore
> -
>
> Key: HADOOP-13649
> URL: https://issues.apache.org/jira/browse/HADOOP-13649
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Aaron Fabbri
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-13649.001.patch
>
>
> LocalMetadataStore is primarily a reference