
[jira] [Commented] (HIVE-10503) Aggregate stats cache: follow up optimizations

2015-04-29 Thread Vaibhav Gumashta (JIRA)


[ https://issues.apache.org/jira/browse/HIVE-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520670#comment-14520670 ]

Vaibhav Gumashta commented on HIVE-10503:
-

cc [~alangates]

 Aggregate stats cache: follow up optimizations
 --

 Key: HIVE-10503
 URL: https://issues.apache.org/jira/browse/HIVE-10503
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 1.2.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 1.3.0


 Some follow-up work items:
 1. Estimate the number of cache nodes from a memory size - currently the user
    has to specify the cache size as a number of nodes.
 2. Make the AggregateStatsCache#add method asynchronous - adding to the cache
    can happen in a new thread.
 3. Based on perf testing, explore an alternate data structure for the node
    list per cache key.
 4. Explore ideas to reduce the locking granularity of the value list per
    cache key.
 5. There is an O(n*n) loop while finding the match - that should go away.
 6. Use a single call to the DB to get the aggregate for columns not in the cache.
 7. Organize metrics capturing in a better way.
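
 Items 1 and 2 can be illustrated with a small sketch. It is not the
 AggregateStatsCache implementation: the class name, the per-node byte
 estimate, and every method below are hypothetical, and the map of long values
 only stands in for the real cache nodes. It assumes the node limit is derived
 by dividing a memory budget by an assumed average node size, and that adds are
 handed to a single background thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Hive code.
final class AsyncAddSketch {
    // Stand-in for the real cache: key -> aggregated value.
    private final Map<String, Long> cache = new ConcurrentHashMap<>();
    // One writer thread, so queued adds are applied one after another.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final int maxNodes;

    AsyncAddSketch(long maxMemoryBytes, long estimatedBytesPerNode) {
        this.maxNodes = estimateMaxNodes(maxMemoryBytes, estimatedBytesPerNode);
    }

    // Item 1: derive the node limit from a memory budget.
    // estimatedBytesPerNode is an assumption the caller must supply.
    static int estimateMaxNodes(long maxMemoryBytes, long estimatedBytesPerNode) {
        if (maxMemoryBytes < 0 || estimatedBytesPerNode <= 0) {
            throw new IllegalArgumentException("invalid memory budget or node size");
        }
        return (int) Math.min(Integer.MAX_VALUE, maxMemoryBytes / estimatedBytesPerNode);
    }

    // Item 2: the caller returns immediately; the put happens on the writer thread.
    Future<?> addAsync(String key, long value) {
        return writer.submit(() -> {
            // Updates to existing keys are always allowed; new keys only below the limit.
            if (cache.containsKey(key) || cache.size() < maxNodes) {
                cache.put(key, value);
            }
        });
    }

    Long get(String key) {
        return cache.get(key);
    }

    int maxNodes() {
        return maxNodes;
    }

    // Stops accepting adds and waits for queued ones; returns false on timeout or interrupt.
    boolean shutdownAndWait(long timeoutMillis) {
        writer.shutdown();
        try {
            return writer.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

 A reader that calls get right after addAsync may miss the new value, which is
 the usual trade-off of taking the add off the request path.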



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-10503) Aggregate stats cache: follow up optimizations

2015-04-28 Thread Thejas M Nair (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517885#comment-14517885 ]

Thejas M Nair commented on HIVE-10503:
--

Regarding the eviction policy (AggregateStatsCache.evictOneNode):

Evicting one LRU node at a time is expensive.
I think we should instead reduce the TTL to 0.9 * TTL, 0.8 * TTL, and so on,
and call this function again each time.
Ideally, in the long term, we should consider both the frequency of use
and the cost of recomputing the stats when deciding which ones to evict.
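
The TTL-shrinking suggestion could look roughly like the sketch below. It is
not the AggregateStatsCache code: the class, its fields, and evictOlderThan are
hypothetical, and it assumes the function being called again is one that drops
every node older than a given TTL. Each pass removes a whole batch of stale
nodes instead of a single LRU node.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Hive code.
final class TtlShrinkEvictionSketch {
    // key -> last access time in milliseconds (stand-in for real cache nodes).
    private final Map<String, Long> lastAccess = new LinkedHashMap<>();
    private final long ttlMillis;
    private final int maxNodes;

    TtlShrinkEvictionSketch(long ttlMillis, int maxNodes) {
        this.ttlMillis = ttlMillis;
        this.maxNodes = maxNodes;
    }

    // Records an add or a read of the key at the given time.
    void touch(String key, long nowMillis) {
        lastAccess.put(key, nowMillis);
    }

    int size() {
        return lastAccess.size();
    }

    // Removes every node whose age is strictly greater than the effective TTL.
    int evictOlderThan(long effectiveTtlMillis, long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastAccess.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > effectiveTtlMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Tries TTL, 0.9 * TTL, 0.8 * TTL, ... down to 0 until the cache fits.
    // Nodes touched exactly at nowMillis survive even the last pass.
    int evictUntilFits(long nowMillis) {
        int removed = 0;
        for (int step = 10; step >= 0 && lastAccess.size() > maxNodes; step--) {
            removed += evictOlderThan(ttlMillis * step / 10, nowMillis);
        }
        return removed;
    }
}
```

The longer-term idea of weighing frequency of use and recomputation cost is not
covered here; this sketch only looks at node age.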


[jira] [Commented] (HIVE-10503) Aggregate stats cache: follow up optimizations

2015-04-27 Thread Vaibhav Gumashta (JIRA)


[ https://issues.apache.org/jira/browse/HIVE-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514724#comment-14514724 ]

Vaibhav Gumashta commented on HIVE-10503:
-

cc [~thejas] [~mmokhtar]. I'll start working on this in a few days.

