[jira] [Commented] (SOLR-13579) Create resource management API

2019-08-08 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903131#comment-16903131
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

Thanks for the review:
 # Yes, it was a bug - there was a missing conditional that checked whether the 
other pool of the same type already has this component.
 # Definitely, but the API is still in flux - I'll add it once the API is 
somewhat stabilized.
 # Not yet - again, it requires declaring commands and parameters in a separate 
JSON file, which at this point I think is premature while the implementation 
keeps changing.

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch, 
> SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch, 
> SOLR-13579.patch
>
>
> Resource management framework API supporting the goals outlined in SOLR-13578.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13579) Create resource management API

2019-08-06 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901705#comment-16901705
 ] 

Shalin Shekhar Mangar commented on SOLR-13579:
--

Thanks [~ab]. This is looking good. I've done a first pass through the design 
and code. It took some time to wrap my head around it, and your Jira comments 
describing the use case and how it works really helped.

I have some initial comments:
# I think DefaultResourceManager has a bug. A pool is created by createPool, 
scheduled immediately, and added to the resourcePools map under the name of 
the resource pool. So presumably we can create multiple pools of the same 
type, which is as per the design. But the #registerComponent() method gets 
the pool for the given name and checks that there are no other pools with the 
same type? AIUI, there are no checks to see whether the given managed 
component is actually registered in the other pools of the same type. This 
can be easily demonstrated by changing the 
TestDefaultResourceManagerPool.testBasic method and adding another pool with 
the same type.
# The package-info.java for the managed package can benefit from some of the 
design documentation you have added in this Jira.
# There is no v2 api for the /admin/resources?
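
The check discussed in the first point could look roughly like the sketch below. It is only an illustration under stated assumptions: `PoolRegistrySketch`, its maps and its method signatures are hypothetical stand-ins, not the code in the patch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for DefaultResourceManager; not the code from the patch.
class PoolRegistrySketch {
  // pool name -> pool type
  private final Map<String, String> poolTypes = new HashMap<>();
  // pool name -> ids of the components registered in that pool
  private final Map<String, Set<String>> poolComponents = new HashMap<>();

  void createPool(String name, String type) {
    if (poolTypes.containsKey(name)) {
      throw new IllegalArgumentException("Pool already exists: " + name);
    }
    poolTypes.put(name, type);
    poolComponents.put(name, new HashSet<>());
  }

  void registerComponent(String poolName, String componentId) {
    String type = poolTypes.get(poolName);
    if (type == null) {
      throw new IllegalArgumentException("Unknown pool: " + poolName);
    }
    // The check under discussion: look at the OTHER pools of the same type and
    // refuse if any of them already contains this component.
    for (Map.Entry<String, String> e : poolTypes.entrySet()) {
      boolean otherPoolSameType = !e.getKey().equals(poolName) && e.getValue().equals(type);
      if (otherPoolSameType && poolComponents.get(e.getKey()).contains(componentId)) {
        throw new IllegalArgumentException(
            componentId + " is already registered in pool " + e.getKey() + " of type " + type);
      }
    }
    poolComponents.get(poolName).add(componentId);
  }

  boolean isRegistered(String poolName, String componentId) {
    Set<String> ids = poolComponents.get(poolName);
    return ids != null && ids.contains(componentId);
  }
}
```

With two pools of type "cache", registering the same component id in the second pool is rejected, while registering it in a pool of a different type is still allowed.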

I'm going to do another pass and try it out and get back to you.




[jira] [Commented] (SOLR-13579) Create resource management API

2019-08-01 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898311#comment-16898311
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

Updated patch refactored to use type-safe interfaces for getting / setting the 
limits and retrieving the monitored values. Also, added a simple unit test.




[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-31 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897369#comment-16897369
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

Ah, this makes perfect sense. I'll try refactoring the API along these lines.




[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-31 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897347#comment-16897347
 ] 

Hoss Man commented on SOLR-13579:
-


bq. We could perhaps call a type-safe and name-safe component API from a 
generic management API by following a similar convention as the one used in 
SolrPluginUtils.invokeSetters? Or use marker interfaces that also provide 
validation / conversion. I'll look into this.

Unless there's something i'm missing (and that's incredibly likely) I don't 
even think you'd need a SolrPluginUtils.invokeSetters type hack for any of this 
-- except maybe mapping REST commands in the ResourceManagerHandler to methods 
in the ResourceManagerPlugins?

what i was imagining was a more straightforward subclass/subinterface 
relationship and using generics to tightly couple the ManagedComponent impls to 
the corresponding ResourceManagerPlugins -- so the plugins could have 
completely statically typed APIs for calling methods on the Components.  ala...

{code}
import java.util.Map;

public interface ManagedComponent {
  ManagedComponentId getManagedComponentId();
  // ...
}

public abstract class ResourceManagerPlugin<T extends ManagedComponent> {
  /** if needed by ResourceManagerHandler or metrics */
  public abstract void setResourceLimits(ManagedComponentId component, Map<String, Object> limits);
  /** if needed by ResourceManagerHandler or metrics */
  public abstract Map<String, Object> getResourceLimits(ManagedComponentId component);
  // ...
  // other general API methods needed for linking/registering type "T" components
  // (or Pool) and for "managing" all of them...
  // ...
}

public interface ManagedCacheComponent extends ManagedComponent {
  // actual caches implement this, and only have to worry about type specific methods
  // for managing their resource related settings -- nothing about the REST API...
  public void setMaxSize(long size);
  public void setMaxRamMB(int maxRamMB);
  public long getMaxSize();
  public int getMaxRamMB();
}

public class CacheManagerPlugin extends ResourceManagerPlugin<ManagedCacheComponent> {
  // concrete impls like this can use the statically typed get/set methods of the concrete
  // ManagedComponent impls in their getResourceLimits/setResourceLimits & manage methods
  // ...
}
{code}
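
To make the outline above concrete, here is a compact, compilable sketch of the same idea. It deliberately simplifies: the component is passed directly instead of a `ManagedComponentId`, the id is a plain `String`, and all type names (`ManagedComponentSketch`, `TypedPluginSketch`, `CachePluginSketch`, `ToyCache`) and the map keys `"maxSize"` / `"maxRamMB"` are assumptions made for illustration, not the patch's API.

```java
import java.util.HashMap;
import java.util.Map;

// All names below are illustrative; they mirror the outline above but are not the patch's API.
interface ManagedComponentSketch {
  String getManagedComponentId();
}

interface ManagedCacheSketch extends ManagedComponentSketch {
  void setMaxSize(long size);
  void setMaxRamMB(int maxRamMB);
  long getMaxSize();
  int getMaxRamMB();
}

// Generic base: the REST/metrics layer talks in Maps, subclasses talk in typed calls.
abstract class TypedPluginSketch<T extends ManagedComponentSketch> {
  abstract void setResourceLimits(T component, Map<String, Object> limits);
  abstract Map<String, Object> getResourceLimits(T component);
}

class CachePluginSketch extends TypedPluginSketch<ManagedCacheSketch> {
  @Override
  void setResourceLimits(ManagedCacheSketch cache, Map<String, Object> limits) {
    // Validation and conversion live in one place, not in every cache implementation.
    Object maxSize = limits.get("maxSize");
    if (maxSize instanceof Number) {
      cache.setMaxSize(((Number) maxSize).longValue());
    }
    Object maxRamMB = limits.get("maxRamMB");
    if (maxRamMB instanceof Number) {
      cache.setMaxRamMB(((Number) maxRamMB).intValue());
    }
  }

  @Override
  Map<String, Object> getResourceLimits(ManagedCacheSketch cache) {
    Map<String, Object> limits = new HashMap<>();
    limits.put("maxSize", cache.getMaxSize());
    limits.put("maxRamMB", cache.getMaxRamMB());
    return limits;
  }
}

// Toy cache used only to exercise the plugin.
class ToyCache implements ManagedCacheSketch {
  private long maxSize = 512;
  private int maxRamMB = 64;
  public String getManagedComponentId() { return "toyCache"; }
  public void setMaxSize(long size) { this.maxSize = size; }
  public void setMaxRamMB(int maxRamMB) { this.maxRamMB = maxRamMB; }
  public long getMaxSize() { return maxSize; }
  public int getMaxRamMB() { return maxRamMB; }
}
```

In this sketch a non-numeric value in the generic map is silently ignored; a real implementation would more likely report it as an error.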






[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-30 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896204#comment-16896204
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

On the use cases:

bq. CacheManagerPlugin would only ever reduce the maxRamMB setting of some 
caches at run time
Again, the current implementation of {{CacheManagerPlugin}} is a simplistic 
draft.

Ultimately the controlled value of {{maxRamMB}} would be tied proportionally to 
two main factors:
* the {{hitratio}} metric (i.e. caches with low hit rate don't need as much RAM 
so their {{maxRamMB}} would be trimmed down). This is an optimization of 
resource usage.
* and the total {{ramBytesUsed}} across all cores would be used as a hard 
limit, proportionally applied to all caches' {{maxRamMB}}, overriding the above 
optimization if necessary. This is a hard control limit, which indeed is 
related to the current number of cores.

Initial value of {{maxRamMB}} would still come from the config, as it does 
today, but then during runtime it would be modified both up and down from that 
value depending on the situation.

bq. users who want to use these pools need to change the individual cache's 
configured maxRamMB to be much higher than they are today. (potentially to the 
same value as the maxRamMB of the pool?)
I think it would work the other way around - users can specify whatever they 
want, but if the admin sets a total {{maxRamMB}} to a lower value than the 
aggregate value that users requested, their requests will be proportionally 
scaled down (see also above for a finer-grained optimization adjustment, not 
just the hard limit).
So in reality the amount of RAM each core and each cache would get would be 
determined as follows:
* initial value would be set from the config's {{maxRamMB}}, unless it would 
already hit the global limit
* this value would be quickly trimmed down based on the {{hitratio}}, and 
eventually scaled up as the {{hitratio}} increases. Some other metric could be 
used here, too, to make this scale down/up process more efficient.
* if a bunch of other cores were suddenly allocated to the same node it's 
likely that the aggregate {{ramBytesUsed}} would hit the global ceiling and the 
plugin would start trimming down {{maxRamMB}} of each cache in each core 
(possibly using some weighted scheme instead of purely proportional). 
* if the number of cores were to decrease so that their aggregate 
{{ramBytesUsed}} fell below a percentage of the hard limit, say 80%, the 
plugin could proportionally increase each {{maxRamMB}} so that together they 
add up to e.g. 80% of the hard limit.
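
The proportional scale-down and scale-up described in the list above can be sketched as follows. This is a hypothetical helper, not code from the patch: it assumes a purely proportional scheme, trims when the sum of the requested `maxRamMB` values exceeds the hard limit, and takes the "say 80%" low-water mark as a parameter.

```java
// Hypothetical helper, not code from the patch: purely proportional rescaling of
// per-cache maxRamMB values against a global hard limit.
class ProportionalScalingSketch {

  static double sum(double[] values) {
    double total = 0;
    for (double v : values) {
      total += v;
    }
    return total;
  }

  /** Rescales all values by one common factor so that they add up to targetTotalMB. */
  static double[] scaleTo(double[] maxRamMB, double targetTotalMB) {
    double total = sum(maxRamMB);
    double[] scaled = maxRamMB.clone();
    if (total <= 0) {
      return scaled; // nothing to scale
    }
    double factor = targetTotalMB / total;
    for (int i = 0; i < scaled.length; i++) {
      scaled[i] = maxRamMB[i] * factor;
    }
    return scaled;
  }

  /** One adjustment step; lowWaterFraction is the "say 80%" from the text, passed as 0.8. */
  static double[] adjust(double[] maxRamMB, double usedMB, double hardLimitMB,
                         double lowWaterFraction) {
    double requested = sum(maxRamMB);
    if (requested > hardLimitMB) {
      // Requests exceed the hard limit: trim everyone by the same factor.
      return scaleTo(maxRamMB, hardLimitMB);
    }
    double target = lowWaterFraction * hardLimitMB;
    if (usedMB < target && requested < target) {
      // Plenty of headroom: grow everyone up to the low-water mark.
      return scaleTo(maxRamMB, target);
    }
    return maxRamMB.clone();
  }
}
```

A weighted scheme, as hinted at in the list, would replace the single common factor with a per-cache factor.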

bq.  how/when can/should a CacheManagerPlugin assume/recognize that the memory 
pressure has decreased?
Using the {{ramBytesUsed}} metric for the hard limit, and the {{hitratio}} 
metric for optimization.

If {{hitratio}} is high then we need as much RAM as possible to expand the 
cache, until we either hit the core's limit, or the global limit, OR the 
{{hitratio}} falls below a threshold. If {{hitratio}} falls below a threshold 
then we know the cache contains mostly useless items and we can trim down its 
{{maxRamMB}}, which will lead to evictions, which in turn will lead to an 
increased {{hitratio}}.
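
A single optimization step for one cache, following the rule just described, might look like this. The threshold, step size, floor and ceiling are illustrative parameters chosen for the sketch, not values from the patch, and the ceiling stands in for whichever of the core or global limits applies.

```java
// Hypothetical single optimization step for one cache; threshold and step size are
// illustrative parameters, not values from the patch.
class HitRatioStepSketch {
  static int nextMaxRamMB(int currentMB, double hitRatio, double lowHitRatio,
                          int stepMB, int floorMB, int ceilingMB) {
    if (hitRatio < lowHitRatio) {
      // Mostly useless items: shrink, which forces evictions.
      return Math.max(floorMB, currentMB - stepMB);
    }
    // The cache is earning its keep: grow until the (core or global) ceiling is reached.
    return Math.min(ceilingMB, currentMB + stepMB);
  }
}
```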




[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-30 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896177#comment-16896177
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

On the API itself:

bq. ...but the code in ResourceManagerPlugin is also independent of any 
specific type of resource(s) that a pool can manage
{{ResourceManagerPlugin}} is an interface so it has no code of its own. 
Subclasses implement the actual logic of what to monitor and how to control it, 
so it made sense to make it a separate interface from a pool, which is 
responsible for collecting and aggregating the data from components. As I 
mentioned, I can easily foresee a future 1:N mapping between pool and plugins, 
in order to manage different types of resource limits of a component in one 
pool.

Concrete example of a component that consumes different types of resources that 
we may want to manage is SolrIndexSearcher - here we have caches, merge IO, 
update threads and query threads. We may want to manage all of these aspects by 
registering SolrIndexSearcher in a single pool that supports these types of 
mgmt plugins, instead of registering it in several pools, each managing one 
aspect of the component.

bq. "loose coupling" that currently exists in the patch between the 
ManagedComponent API and ResourceManagerPlugin
I agree, this is an important concern - please remember that this is just an 
initial attempt to cover all bases, and I thought that using a very generic API 
could protect us from the combinatoric explosion of the API between the 
management framework and the different types of components. As you noted, the 
unfortunate downside of this approach is the complexity of validating and 
applying the modifications in the components...
We could perhaps call a type-safe and name-safe component API from a generic 
management API by following a similar convention as the one used in 
{{SolrPluginUtils.invokeSetters}}? Or use marker interfaces that also provide 
validation / conversion. I'll look into this.




[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-29 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895371#comment-16895371
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

bq. Hmmm, Did you mean to upload a diff patch? the latest i see (#12975831) 
still contains lots of new class names referring to "Resource" instead of 
"Component" ...
I meant the use of "component" where it refers to Solr components - previous 
versions of the patch confusingly referred to these components as "resources", 
hence eg. ManagedResource -> ManagedComponent. Other names are related to the 
management of actual hardware resources (ram, IO, etc.) so I felt the remaining 
class names with ..resource.. are still appropriate here.




[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-26 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894184#comment-16894184
 ] 

Hoss Man commented on SOLR-13579:
-

Honestly, i'm still very lost.

Part of my struggle is i'm trying to wade into the patch, and review the APIs 
and functionality it contains, while knowing – as you mentioned – that not 
all the details are here, and it's not fully fleshed out w/everything you 
intend as far as configuration and customization and having more concrete 
implementations beyond just the {{CacheManagerPlugin}}.

I know that in your mind there is more that can/should be done, and that some 
of this code is just "placeholder" for later, but i don't have enough 
familiarity with the "long term" plan to really understand what in the current 
patch is placeholder or stub APIs, vs what is "real" and exists because of long 
term visions for how all of these pieces can be used together in a more 
generalized system – ie: what classes might have surface APIs that look more 
complex than needed given what's currently implemented in the patch, because of 
how you envision those classes being used in the future?

Just to pick one example: my question about the "ResourceManagerPool" vs 
"ResourceManagerPlugin" – in your reply you said...
{quote}The code in ResourceManagerPool is independent of the type of 
resource(s) that a pool can manage. ...
{quote}
...but the code in {{ResourceManagerPlugin}} is _also_ independent of any 
specific type of resource(s) that a pool can manage – those specifics only 
exist in the concrete subclasses. Hence the crux of my question is why these 
two very generalized pieces of abstract functionality/data collection couldn't 
just be a single abstract base class for all (concrete) ResourceManagerPlugin 
subclasses to extend?

Your followup gives a clue...
{quote}...perhaps at some point we could allow a single pool to manage several 
aspects of a component, in which case a pool could have several plugins.
{quote}
but w/o some "concrete hypothetical" examples of what that might look like, 
it's hard to evaluate if the current APIs are the "best" approach, or if maybe 
there is something better/simpler.
{quote}Also, there can be different pools of the same type, each used for a 
different group of components that support the same management aspect. For 
example, for searcher caches we may want to eventually create separate pools 
for filterCache, queryResultCache and fieldValueCache. All of these pools would 
use the same plugin implementation CacheManagerPlugin but configured with 
different params and limits.
{quote}
But even in this situation, there could be multiple *instances* of a 
{{CacheManagerPlugin}}, one for each pool, each with different params and 
limits, w/o needing distinction between the {{ResourceManagerPlugin}} 
concept/instances and the {{ResourceManagerPool}} concept/instances.

(To be clear, i'm not trying to harp on the specific design/separation/linkage 
of {{ResourceManagerPlugin}} vs {{ResourceManagerPool}} – these are just some 
of the first classes i looked at and had questions about. I'm just using them 
as examples of where/how it's hard to ask questions or form opinions about the 
current API/code w/o having a better grasp of some "concrete specifics" (or even 
"hypothetical specifics") of when/how/where/why each of these APIs are expected 
to be used and interact w/each other.)

Another example of where i got lost as to the specific motivation behind some 
of these APIs in the long term view is in the "loose coupling" that currently 
exists in the patch between the {{ManagedComponent}} API and 
{{ResourceManagerPlugin}}:
 As i understand it:
 * An object in Solr supports being managed by a particular subclass of 
{{ResourceManagerPlugin}} if and only if it extends {{ManagedComponent}} and 
implements {{ManagedComponent.getManagedResourceTypes()}} such that the 
resulting {{Collection<String>}} contains a String matching the return value of 
a {{ResourceManagerPlugin.getType()}} for that particular 
{{ResourceManagerPlugin}}
 ** ie: {{SolrCache}} extends the {{ManagedComponent}} interface, and all 
classes implementing {{SolrCache}} should/must implement 
{{getManagedResourceTypes()}} by returning a java {{Collection}} containing 
{{CacheManagerPlugin.TYPE}}
 * once some {{ManagedComponent}} instances are "registered in a pool" and 
managed by a specific {{ResourceManagerPlugin}} instance then that plugin 
expects to be able to call 
{{ManagedComponent.setResourceLimits(Map<String, Object> limits)}} and 
{{ManagedComponent.getResourceLimits()}} on all of those 
{{ManagedComponent}} instances, and that both Maps should contain/support a set 
of {{String}} keys specific to that {{ResourceManagerPlugin}} subclass according 
to {{ResourceManagerPlugin.getControlledParams()}}
 ** ie: {{CacheManagerPlugin.getControlledParams()}} returns a java 
{{Collection}} containing 

[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-25 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892869#comment-16892869
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

Updated patch:
 * renamed ManagedResource to ManagedComponent
 * consistently use the name "component" instead of the confusing "resource"




[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-25 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892682#comment-16892682
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

bq. why is the "ResourceManagerPool" class different from the 
"ResourceManagerPlugin" class?
The code in {{ResourceManagerPool}} is independent of the type of resource(s) 
that a pool can manage. I decided to leave them separate for now - perhaps at 
some point we could allow a single pool to manage several aspects of a 
component, in which case a pool could have several plugins.

Also, there can be different pools of the same type, each used for a different 
group of components that support the same management aspect. For example, for 
searcher caches we may want to eventually create separate pools for 
filterCache, queryResultCache and fieldValueCache. All of these pools would use 
the same plugin implementation {{CacheManagerPlugin}} but configured with 
different params and limits.

bq. What happens if a single ManagedResource is part of two different "pools" 
with two different ResourceManagerPlugins that give conflicting/overlapping 
instructions?

Currently this is not allowed ({{DefaultResourceManager.addResource:183}}). In 
theory, I could imagine a component to belong to more than 1 pool of the same 
type - eg. one being a global per-node pool for coarse-grained control and the 
other being a local per-core pool for fine-grained optimization.

However, at this point my head explodes thinking about all possible bad 
interactions, so the code expressly forbids it. :)

bq. does that imply that once SolrCache(s) are part of a "pool" they no longer 
have their own max size(s)?
They still do - but it's used as the starting point for proportional 
adjustments.

As I mentioned above, the exact details of how the adjustments are distributed 
among all caches are still unclear - in the current patch they are applied 
proportionally to each cache's maxSize / maxRamMB. It should be easy to add 
more complex priorities or weights - I wanted to start with something simple to 
illustrate the concept.

bq. how/where would someone specify a "preference" for ensuring that if a 
"pool" is "full" that certain resources should be managed more aggressively than 
others

In the current API that would probably need to be defined somewhere between 
{{SolrCache.getResourceLimits()}} and {{CacheManagerPlugin}}, ie. the cache 
would report its "priority" as one of the limits and the plugin would know what 
to do about it.
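
One way a plugin could act on such a reported priority is sketched below. This is only an assumption about how "the plugin would know what to do about it": the `CacheLimits` holder, the `trim` method and the weighting by `maxRamMB / priority` are all hypothetical and do not appear in the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: "priority" is reported next to maxRamMB and used as a weight.
class PriorityTrimSketch {

  static final class CacheLimits {
    final double maxRamMB;
    final double priority; // > 0; higher means "trim me less"

    CacheLimits(double maxRamMB, double priority) {
      this.maxRamMB = maxRamMB;
      this.priority = priority;
    }
  }

  /** Splits reductionMB across caches in proportion to maxRamMB / priority. */
  static Map<String, Double> trim(Map<String, CacheLimits> caches, double reductionMB) {
    double weightSum = 0;
    for (CacheLimits c : caches.values()) {
      weightSum += c.maxRamMB / c.priority;
    }
    Map<String, Double> newMaxRamMB = new LinkedHashMap<>();
    for (Map.Entry<String, CacheLimits> e : caches.entrySet()) {
      CacheLimits c = e.getValue();
      double cut = weightSum > 0 ? reductionMB * (c.maxRamMB / c.priority) / weightSum : 0;
      // Never go below zero, even if the requested reduction is larger than the cache.
      newMaxRamMB.put(e.getKey(), Math.max(0, c.maxRamMB - cut));
    }
    return newMaxRamMB;
  }
}
```

When all priorities are equal the weights reduce to `maxRamMB`, so the sketch falls back to the plain proportional adjustment used in the current patch.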

bq. Also, FYI: with this patch, we now have 2 "ManagedResource" classes in 
solr/core that have absolutely nothing to do with each other...
Yeah, I'll rename this one to something else.




[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-25 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892670#comment-16892670
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

The main scenario that prompted this development was a need to control the 
aggregated cache sizes across all cores in a CoreContainer in a multi-tenant 
(uncooperative) situation. However, it seemed like a similar approach would be 
applicable for controlling other runtime usage of resources in a Solr node - 
hence the attempt to come up with a generic framework.

A particular component may support resource management of several of its 
aspects. Eg. a {{SolrIndexSearcher}} can have a "cache" RAM usage aspect, 
"mergeIO" throttling aspect, "mergeThreadCount" aspect, "queryThreadCount" 
aspect, etc. Each of these aspects can be managed by a different global pool 
that defines total resource limits of a given type. Currently a component can 
be registered only in a single pool of a given type, in order to avoid 
conflicting instructions.

In the current patch the component registration and pool creation parts are 
primitive - the default pools are created statically and components are forced 
to register in a dedicated pool. In the future this could be configurable - eg. 
components from cores belonging to different collections may belong to 
different pools with different limits / priorities.

In the following stories, there are always two aspects of resource management - 
control and optimization. The control aspect ensures that the specified hard 
limits are observed, while the optimization aspect ensures that each component 
uses resources in an optimal way. The focus of this JIRA issue is mainly on the 
control aspect, with optimization to follow later.

h2. Story 1: controlling global cache RAM usage in a Solr node
{{SolrIndexSearcher}} caches are currently configured statically, using either 
item count limits or {{maxRamMB}} limits. We can only specify the limit 
per-cache and then we can limit the number of cores in a node to arrive at a 
hard total upper limit.

However, this is not enough because it leads to keeping the heap at the upper 
limit when the actual consumption by caches might be far lower. It'd be nice 
for a more active core to be able to use more heap for caches than another core 
with less traffic while ensuring that total heap usage never exceeds a given 
threshold (the optimization aspect). It is also required that total heap usage 
of caches doesn't exceed the max threshold to ensure proper behavior of a Solr 
node (the control aspect).

In order to do this we need a control mechanism that is able to adjust 
individual cache sizes per core, based on the total hard limit and the actual 
current "need" of a core, defined as a combination of hit ratio, QPS, and other 
arbitrary quality factors / SLA. This control mechanism also needs to be able 
to forcibly reduce excessive usage (evenly? prioritized by collection's SLA?) 
when the aggregated heap usage exceeds the threshold.

In terms of the proposed API this scenario would work as follows:
 * a global resource pool "searcherCachesPool" is created with a single hard 
limit on eg. total {{maxRamMB}}.
 * this pool knows how to manage components of a "cache" type - what parameters 
to monitor and what parameters to use in order to control their resource usage. 
This logic is encapsulated in {{CacheManagerPlugin}}.
 * all searcher caches from all cores register themselves in this pool for the 
purpose of managing their "cache" aspect.
 * the plugin is executed periodically to check the current resource usage of 
all registered caches, using eg. the aggregated value of {{ramBytesUsed}}.
 * if this aggregated value exceeds the total {{maxRamMB}} limit configured for 
the pool then the plugin adjusts the {{maxRamMB}} setting of each cache in 
order to reduce the total RAM consumption - currently this uses a simple 
proportional formula without any history (the P part of PID), with a dead-band 
in order to avoid thrashing. Also, for now, this addresses only the control 
aspect (exceeding a hard threshold) and not the optimization, i.e. it doesn't 
proactively reduce / increase {{maxRamMB}} based on hit rate.
 * as a result of this action some of the cache content will be evicted sooner 
and more aggressively than initially configured, thus freeing more RAM.
 * when the memory pressure decreases the {{CacheManagerPlugin}} re-adjusts the 
{{maxRamMB}} settings of each cache to the initially configured values. Again, 
the current implementation of this algorithm is very simple but can be easily 
improved because it's cleanly separated from implementation details of each 
cache.
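
The periodic control step from the list above (proportional correction with a dead-band, then a return to the configured values once pressure drops) can be sketched like this. The class, its fields and the exact dead-band semantics are assumptions made for illustration; the patch's `CacheManagerPlugin` may differ in all of them.

```java
// Hypothetical controller; field names and the exact dead-band semantics are assumptions.
class CachePoolControllerSketch {
  private final double poolLimitMB;
  private final double deadBand; // e.g. 0.1 = ignore deviations within 10% of the limit
  private final double[] configuredMaxRamMB;
  private final double[] currentMaxRamMB;

  CachePoolControllerSketch(double poolLimitMB, double deadBand, double[] configuredMaxRamMB) {
    this.poolLimitMB = poolLimitMB;
    this.deadBand = deadBand;
    this.configuredMaxRamMB = configuredMaxRamMB.clone();
    this.currentMaxRamMB = configuredMaxRamMB.clone();
  }

  /** One periodic run; ramUsedMB holds each cache's ramBytesUsed converted to MB. */
  void run(double[] ramUsedMB) {
    double totalUsedMB = 0;
    for (double used : ramUsedMB) {
      totalUsedMB += used;
    }
    if (totalUsedMB > poolLimitMB * (1 + deadBand)) {
      // Over the limit: proportional correction only (the "P" of PID), no history.
      double ratio = poolLimitMB / totalUsedMB;
      for (int i = 0; i < currentMaxRamMB.length; i++) {
        currentMaxRamMB[i] *= ratio;
      }
    } else if (totalUsedMB < poolLimitMB * (1 - deadBand)) {
      // Pressure has decreased: go back to the initially configured values.
      System.arraycopy(configuredMaxRamMB, 0, currentMaxRamMB, 0, currentMaxRamMB.length);
    }
    // Inside the dead-band: leave the limits alone to avoid thrashing.
  }

  double[] currentLimits() {
    return currentMaxRamMB.clone();
  }
}
```

Jumping straight back to the configured values is the simplest reading of the last step; a gentler ramp-up would be an obvious refinement.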

h2. Story 2: controlling global IO usage in a Solr node.

Similarly to the scenario above, currently we can only statically configure 
merge throttling (RateLimiter) per core but we can't monitor and control the 
total IO 

[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-24 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892241#comment-16892241
 ] 

Hoss Man commented on SOLR-13579:
-

I spent some time breifly skimming the patch, and TBH got lost very quickly.

I think it would be helpful (probably to more folks than just myself) if we 
could discuss, in "story" form, some (existing or hypothetical) examples of 
scenarios that could come up; how this new system would be helpful & behave in 
those scenarios, and what classes/objects (either in this patch, or yet to be 
written) would be responsible for each bit of action/reaction in those stories.

ie: I'm a solr cluster admin and I have some existing collections using the 
(existing) default cache configurations. When/why might i want to setup some 
pools? what types of steps would i take to do so? how would my configuration(s) 
change? After i have some pools in place, what's an example of something that 
might happen during runtime that would cause the ResourceManager to "do 
something" with my pools/caches? what would that "do something" look like in 
terms of method call stacks?  what would the effective end result be from my 
perspective as an external observer?

Some specific bits that confuse me as i try to wrap my head around the current 
patch...
 * If each named "pool" has exactly one ResourceManagerPlugin that contains the 
(type specific) actual logic for managing "the pool" (and the resources 
using that pool) then why is the "ResourceManagerPool" class different from the 
"ResourceManagerPlugin" class?
 ** as opposed to combining that logic into a single common base class?
 ** is there a one-to-many/many-to-one relationship between them that i'm not 
understanding?

 * can you elaborate on this comment with some concrete examples:
{quote}Each managed resource can be managed by multiple types of plugins and it 
may appear in multiple pools (of different types). This reflects the fact that 
a single component may have multiple aspects of resource management - eg. cache 
mgmt, cpu, threads, etc.
{quote}
 ** ie: if "CacheManagerPlugin.TYPE" is one "type" of pool that a SolrCache 
(implements ManagedResource) might be managed by, what would another 
hypothetical "type" of plugin/pool be that SolrCache might also be a part of?
 *** or if you can't think of a good example of two diff types that a SolrCache 
would be managed by, any example of a concept/object in solr that might become 
a "ManagedResource" that could be managed by two different types of plugins as 
part of 2 diff pools would be helpful
 ** What happens if a single ManagedResource is part of two different "pools" 
with two different ResourceManagerPlugins that give conflicting/overlapping 
instructions?

 * regarding this comment...
{quote}Each pool also has plugin-specific parameters, most notably the limits - 
eg. max total cache size, which the CacheManagerPlugin knows how to use in 
order to adjust cache sizes.
{quote}
 ** does that imply that once SolrCache(s) are part of a "pool" they no longer 
have their own max size(s) ? or is the configured max size of an individual 
cache(s) still a hard upper bound on the "managed size" that might be set at 
runtime as the plugins fire?
 ** how/where would someone specify a "preference" for ensuring that if a 
"pool" is "full" that certain resources should be managed more aggressively than 
others – ex: imagine a cluster admin wants all collections to have SolrCaches 
that are "as big as possible" given the resources of the machines, but wants to 
give priority to a certain subset of the "important" collections if resources 
get constrained; what/where would that be done?


Also, FYI: with this patch, we now have 2 "ManagedResource" classes in 
solr/core that have absolutely nothing to do with each other...
{noformat}
$ find -name ManagedResource.java
./solr/core/src/java/org/apache/solr/rest/ManagedResource.java
./solr/core/src/java/org/apache/solr/managed/ManagedResource.java
{noformat}
...that's a little weird.

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch, 
> SOLR-13579.patch, SOLR-13579.patch
>
>
> Resource management framework API supporting the goals outlined in SOLR-13578.






[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-23 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891285#comment-16891285
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

This updated patch adds the following:
 * integration of SolrIndexSearcher caches into the framework
 * initialization of resource manager and pool limit configurations from 
/clusterprops.json
 * other refactorings

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch, 
> SOLR-13579.patch
>
>
> Resource management framework API supporting the goals outlined in SOLR-13578.






[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-18 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888270#comment-16888270
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

Example requests and responses - this uses SOLR-13558 + a little glue to 
register SolrIndexSearcher caches:

{code}
http://localhost:8983/solr/admin/resources?poolAction=list
{
  "responseHeader": {
    "status": 0,
    "QTime": 0
  },
  "result": {
    "searcherCache": {
      "type": "cache",
      "size": 10,
      "limits": {
        "maxRamMB": 500
      },
      "args": {},
      "resources": [
        "filterCache@7e351be2",
        "perSegFilter@4d1a23c9",
        "documentCache@225da02a",
        "fieldValueCache@90a2ca",
        "queryResultCache@5ff5ad0e",
        "queryResultCache@15c33adb",
        "fieldValueCache@6f100717",
        "perSegFilter@4d5cc184",
        "filterCache@13f35898",
        "documentCache@48dfaca7"
      ]
    }
  }
}

http://localhost:8983/solr/admin/resources?resAction=list=searcherCache

{
  "responseHeader": {
    "status": 0,
    "QTime": 0
  },
  "result": {
    "filterCache@7e351be2": {
      "class": "org.apache.solr.search.FastLRUCache",
      "types": [
        "cache"
      ],
      "managedLimits": {
        "cleanupThread": false,
        "size": 0,
        "showItems": 0,
        "minSize": 460,
        "maxRamMB": -1,
        "acceptableSize": 486
      }
    },
    "perSegFilter@4d1a23c9": {
      "class": "org.apache.solr.search.LRUCache",
      "types": [
        "cache"
      ],
      "managedLimits": {
        "size": 10,
        "maxRamMB": -1
      }
    },
    "documentCache@225da02a": {
      "class": "org.apache.solr.search.LRUCache",
      "types": [
        "cache"
      ],
      "managedLimits": {
        "size": 512,
        "maxRamMB": -1
      }
    },
    "fieldValueCache@90a2ca": {
      "class": "org.apache.solr.search.FastLRUCache",
      "types": [
        "cache"
      ],
      "managedLimits": {
        "cleanupThread": false,
        "size": 0,
        "showItems": -1,
        "minSize": 9000,
        "maxRamMB": -1,
        "acceptableSize": 9500
      }
    },
...
{code}

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch
>
>
> Resource management framework API supporting the goals outlined in SOLR-13578.






[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-18 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887963#comment-16887963
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

Updated patch:
* more refactoring :)
* added ResourceManagerHandler under /admin/resources to expose the resource 
management. This handler supports managing the pools (create / delete / status 
/ modify limits) as well as resources (similarly).

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch, SOLR-13579.patch, SOLR-13579.patch
>
>
> Resource management framework API supporting the goals outlined in SOLR-13578.






[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-16 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886014#comment-16886014
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

Updated patch, with one significant change (based on the work in SOLR-13558): 
allow arbitrary limit types, ie. Object instead of Float. This way the API can 
support controllable parameters that are expressed as eg. booleans, enums, etc.
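
To illustrate what Object-typed limits could look like on the consuming side, here is a minimal sketch. The helper class and its method names are hypothetical and not part of the patch; only the limit names {{maxRamMB}} and {{cleanupThread}} are taken from the example responses in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for reading limits stored as Object values; not part of the patch. */
final class LimitsSketch {

  /** Returns a numeric limit, or the default when the key is absent. */
  static double numberLimit(Map<String, Object> limits, String key, double dflt) {
    Object v = limits.get(key);
    if (v == null) {
      return dflt;
    }
    if (v instanceof Number) {
      return ((Number) v).doubleValue();
    }
    throw new IllegalArgumentException("Limit '" + key + "' is not a number: " + v);
  }

  /** Returns a boolean limit, or the default when the key is absent. */
  static boolean booleanLimit(Map<String, Object> limits, String key, boolean dflt) {
    Object v = limits.get(key);
    if (v == null) {
      return dflt;
    }
    if (v instanceof Boolean) {
      return (Boolean) v;
    }
    throw new IllegalArgumentException("Limit '" + key + "' is not a boolean: " + v);
  }

  public static void main(String[] args) {
    Map<String, Object> limits = new LinkedHashMap<>();
    limits.put("maxRamMB", 500);          // numeric limit
    limits.put("cleanupThread", false);   // boolean limit, not expressible as a Float
    System.out.println(numberLimit(limits, "maxRamMB", -1.0));
    System.out.println(booleanLimit(limits, "cleanupThread", true));
  }
}
```

The trade-off is that type checks move from compile time to the point of use, so a plugin has to validate each value it reads.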

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch, SOLR-13579.patch
>
>







[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-09 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881572#comment-16881572
 ] 

David Smiley commented on SOLR-13579:
-

Oh I see this is a sub-task, and the parent task has a fine description.

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch
>
>







[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-09 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881571#comment-16881571
 ] 

David Smiley commented on SOLR-13579:
-

Could you please add an issue description?  The title is not so self 
explanatory so as to excuse you from writing one.  It's a little unclear to me 
what the objective is.  Node-wide cache management seems to be just one example 
or is that the whole point?  What might other purposes be?  I could use my 
imagination but I'd rather the proposal spell it out for us.  Thanks.

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch
>
>







[jira] [Commented] (SOLR-13579) Create resource management API

2019-07-04 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878603#comment-16878603
 ] 

Andrzej Bialecki  commented on SOLR-13579:
--

This patch contains a draft of the API, with some details of the implementation 
fleshed out, up for discussion - there are no tests and no integration with any 
existing component yet. :)

A high-level design overview:
 * A {{ResourceManager}} manages multiple named pools of resources (flat 
hierarchy for now). A default instance of {{ResourceManager}} would be created 
at a {{CoreContainer}} level so that it can manage global limits for a Solr 
node.
 * Each pool knows how to perform a single specific type of management. This 
handling is actually performed by a {{ResourceManagerPlugin}}, which knows what 
monitored values to retrieve from resources, and knows how to adjust the 
controlled parameters of managed resources.
 * There can be multiple pools of the same type (under different names) - they 
will likely differ in their parameters. Eg. document cache size may be checked 
every 1 min and have one limit, but the query / filter cache size may use 
different parameters, even though the set of monitored parameters and 
controlled parameters are the same (hence the same type).
 * {{ResourceManager}} is responsible for periodically executing the 
{{ResourceManagerPlugin}} of each of the pools, so that it can verify and 
adjust the resources it manages in the pool.
 * Each pool has its own parameters - currently the only global parameter is 
scheduleDelaySeconds, which determines how often the pool will run the 
management plugin to verify and adjust the resource usage.
 * Each pool also has plugin-specific parameters, most notably the limits - eg. 
max total cache size, which the CacheManagerPlugin knows how to use in order to 
adjust cache sizes.
 * Each managed resource can be managed by multiple types of plugins and it may 
appear in multiple pools (of different types). This reflects the fact that a 
single component may have multiple aspects of resource management - eg. cache 
mgmt, cpu, threads, etc.
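
The relationship between the manager, its pools and their plugins could be sketched roughly as follows. All type names here ({{SketchPlugin}}, {{SketchPool}}, {{SketchResourceManager}}) are hypothetical stand-ins and the actual classes in the patch may look quite different; resources are reduced to plain names to keep the example short.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical shapes for illustration; the classes in the patch may look different.
interface SketchPlugin {
  String type();                 // e.g. "cache"
  void manage(SketchPool pool);  // read monitored values, adjust controlled parameters
}

final class SketchPool {
  final String name;
  final SketchPlugin plugin;
  final Map<String, Object> limits;  // plugin-specific limits, e.g. maxRamMB
  final List<String> resources = new CopyOnWriteArrayList<>();  // resource names only

  SketchPool(String name, SketchPlugin plugin, Map<String, Object> limits) {
    this.name = name;
    this.plugin = plugin;
    this.limits = limits;
  }

  /** One management pass: the pool delegates the actual work to its plugin. */
  void runOnce() {
    plugin.manage(this);
  }
}

final class SketchResourceManager implements AutoCloseable {
  private final Map<String, SketchPool> pools = new ConcurrentHashMap<>();
  // Daemon thread so that a forgotten close() does not keep the JVM alive.
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "resource-manager-sketch");
        t.setDaemon(true);
        return t;
      });

  /** Creates a named pool and schedules its plugin every scheduleDelaySeconds (must be > 0). */
  SketchPool createPool(String name, SketchPlugin plugin, Map<String, Object> limits,
                        long scheduleDelaySeconds) {
    SketchPool pool = new SketchPool(name, plugin, limits);
    if (pools.putIfAbsent(name, pool) != null) {
      throw new IllegalArgumentException("Pool already exists: " + name);
    }
    scheduler.scheduleWithFixedDelay(pool::runOnce, scheduleDelaySeconds,
        scheduleDelaySeconds, TimeUnit.SECONDS);
    return pool;
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

In this sketch two pools of the same type would simply be two {{createPool}} calls with the same plugin type but different names, limits and schedule delays.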

The patch also contains an example implementation of a management plugin - 
{{CacheManagerPlugin}}. This plugin uses the API to enforce global limits on 
the cache size. It knows how to retrieve and calculate the current resource 
usage, as reported by the monitored values, and then it adjusts the controlled 
limits of each resource to bring the usage back to the total values that fit 
within the limits defined by the pool. In this case the pool can define global 
limits on the cache {{size}} and {{maxRamMB}} (and these are also the 
parameters to control for each cache), and the plugin uses {{size}} and 
{{ramBytesUsed}} for monitoring the actual resource consumption.

Obviously {{SolrCache}} doesn't implement this API yet, but it's relatively 
easy to add.

I'd appreciate review, comments and suggestions.

> Create resource management API
> --
>
> Key: SOLR-13579
> URL: https://issues.apache.org/jira/browse/SOLR-13579
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Attachments: SOLR-13579.patch
>
>



