Re: Best Practices for Calling Server Side Functions

Barry Oglesby Fri, 15 Apr 2016 18:08:29 -0700

Wes,

Because your regions are colocated, your example actually works. I'm not
sure why you'd do this, and I'm not sure I'd recommend it.


Under the covers, the query determines it is on the Trade region. Then it
gets the buckets (set of integers) from the RegionFunctionContext. Then,
the query, parameters and buckets are passed to the region to be executed.

So, the only thing the RFC is used for is to get the appropriate buckets,
and they should be the same in either case.
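
A rough way to picture why the bucket set comes out the same is the sketch below. It is a simplified model, not Geode's real internals: the RegionModel class, the 113-bucket count, and the "hash modulo bucket count" rule are all assumptions made for illustration. The only point it makes is that two colocated regions with the same bucket count and the same routing scheme map a given filter to the same bucket ids.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Simplified model of filter-to-bucket resolution; NOT Geode's real internals.
final class ColocationSketch {

    // Hypothetical stand-in for a partitioned region's bucket configuration.
    static final class RegionModel {
        final String name;
        final int totalBuckets;

        RegionModel(String name, int totalBuckets) {
            this.name = name;
            this.totalBuckets = totalBuckets;
        }

        // Assumption: bucket id = hash of the routing object modulo the bucket count.
        int bucketFor(Object routingObject) {
            return Math.floorMod(routingObject.hashCode(), totalBuckets);
        }

        // The set of bucket ids that a function filter resolves to on this region.
        Set<Integer> bucketsFor(Set<?> filter) {
            Set<Integer> buckets = new TreeSet<>();
            for (Object routingObject : filter) {
                buckets.add(bucketFor(routingObject));
            }
            return buckets;
        }
    }

    public static void main(String[] args) {
        // Colocated regions share the same bucket count and routing scheme.
        RegionModel orders = new RegionModel("Order", 113);
        RegionModel trades = new RegionModel("Trade", 113);
        Set<String> filter = Collections.singleton("123");

        // Prints true: both regions resolve the filter to the same bucket set.
        System.out.println(orders.bucketsFor(filter).equals(trades.bucketsFor(filter)));
    }
}
```

This says nothing about which member hosts a bucket at a given moment, which is where the rebalance caveat comes in.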

You might see some issues with this idea when buckets are moving around
during a rebalance. You'd have to test in that scenario to verify.


Thanks,
Barry Oglesby


On Fri, Apr 15, 2016 at 5:14 PM, Real Wes Williams <[email protected]>
wrote:

> Barry,
>
> Would passing the RegionFunctionContext to the query execution apply
> whether the original function was executed on the Orders region vs the
> Trades region in your example?  If they are colocated I would intuitively
> think that it _may_ not matter, but if so, the side effects would probably
> be subtle.
>
> To be specific by way of modifying your example, are the following
> equivalent given that Orders and Trades are colocated?
>
> Example 1 - Executing on the Orders region:
>  **********************************************
> Execution execution =
> FunctionService.onRegion(orderRegion).withFilter(Collections.singleton(cusip));
> ResultCollector collector = execution.execute("TradeQueryFunction");
>
> In the function….
> Query query = queryService.newQuery("select * from /Trade where cusip = '123'");
> SelectResults results = (SelectResults) query.execute(rfc, new
> String[] {cusip});
>
> Example 2 - Executing on the Trades region:
>  **********************************************
> Execution execution =
> FunctionService.onRegion(tradeRegion).withFilter(Collections.singleton(cusip));
> ResultCollector collector = execution.execute("TradeQueryFunction");
>
> In the function….
> Query query = queryService.newQuery("select * from /Trade where cusip = '123'");
> SelectResults results = (SelectResults) query.execute(rfc, new
> String[] {cusip});
>
> And then in the function:
>
> On Apr 15, 2016, at 7:53 PM, Barry Oglesby <[email protected]> wrote:
>
> Executing queries in functions can be tricky.
>
> For executing queries in a function, do something like:
>
> - invoke the function with onRegion
> - have the function return true from optimizeForWrite so that it is
> executed only on primary buckets
> - use the Query execute API with a RegionFunctionContext in the function.
> Otherwise, you could easily end up executing the same query on more than
> one member.
>
> If you set a filter, the function (and query) will execute on only the
> member containing the primary or primaries for that filter.
>
> Here is an example with trades.
>
> If you route all trades on a specific cusip to the same bucket using a
> PartitionResolver, then querying for all trades for a specific cusip can be
> done efficiently using a Function. The trades could be stored with a simple
> String key like cusip-id or a complex key containing both the cusip and id.
> Either way, the PartitionResolver will need to be able to return the cusip
> for the routing object.
>
> Invoke the function like:
>
> Execution execution =
> FunctionService.onRegion(this.region).withFilter(Collections.singleton(cusip));
> ResultCollector collector = execution.execute("TradeQueryFunction");
> Object result = collector.getResult();
>
> In the TradeQueryFunction, execute the query like:
>
> RegionFunctionContext rfc = (RegionFunctionContext) context;
> String cusip = (String) rfc.getFilter().iterator().next();
> SelectResults results = (SelectResults) this.query.execute(rfc, new
> String[] {cusip});
>
> Where the query is:
>
> select * from /Trade where cusip = $1
>
> This will route the function request to the member whose primary bucket
> contains the cusip filter. Then it will execute the query on the
> RegionFunctionContext which will just be the data for that bucket. Note:
> the PartitionResolver will also need to be able to return the cusip for
> that filter (which is just the input string itself).
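
The routing rule described above might be sketched as follows. This is a hedged illustration, not code from the thread: the CusipRouting class and its routingObjectFor method are hypothetical names, and the "<cusip>-<id>" key layout with no dash inside the cusip is an assumption. In a real deployment this logic would live inside the getRoutingObject method of a PartitionResolver implementation.

```java
// Hypothetical routing helper; in a real deployment this logic would sit inside
// the getRoutingObject method of a PartitionResolver implementation.
final class CusipRouting {
    private CusipRouting() {}

    // Assumption: entry keys look like "<cusip>-<id>" and cusips contain no '-'.
    static String routingObjectFor(String keyOrFilter) {
        int dash = keyOrFilter.indexOf('-');
        // A bare cusip (as passed to withFilter) has no dash and routes as itself.
        return dash < 0 ? keyOrFilter : keyOrFilter.substring(0, dash);
    }

    public static void main(String[] args) {
        System.out.println(routingObjectFor("123-42")); // entry key: prints 123
        System.out.println(routingObjectFor("123"));    // filter value: prints 123
    }
}
```

Because the entry key and the filter value both resolve to the same routing object, the function lands on the bucket that actually holds the trades for that cusip.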
>
> Here is some more general info on functions.
>
> If you're executing a function onRegion with a replicated region, then the
> function is executed on any member defining that region. Since the region
> is replicated, every server has the same data.
>
> If you're executing a function onRegion with a partitioned region, then
> where the function is invoked depends on the result of optimizeForWrite. If
> optimizeForWrite returns true, the function is invoked on all the members
> containing primary buckets for that region. If optimizeForWrite returns
> false, the function is invoked on as few members as it can that encompass
> all the buckets (so it mixes primary and secondary buckets). For example if
> you have 2 members, and the primaries are split between them, then
> optimizeForWrite returning true means that the function will be invoked on
> both members. Returning false will cause the function to be invoked on only
> one member since each member has all the buckets. I almost always have
> optimizeForWrite return true.
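
The two-member example can be worked through with a small model. This is only a sketch under stated assumptions: the class and method names are hypothetical, and the greedy cover used for the optimizeForWrite-false case is a stand-in for whatever selection Geode really performs, not its actual algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Simplified model of member selection; NOT Geode's actual algorithm.
final class OptimizeForWriteSketch {
    private OptimizeForWriteSketch() {}

    // optimizeForWrite == true: every member hosting at least one primary bucket.
    static List<String> targetsForWrite(Map<String, Set<Integer>> primaries) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, Set<Integer>> e : primaries.entrySet()) {
            if (!e.getValue().isEmpty()) {
                targets.add(e.getKey());
            }
        }
        return targets;
    }

    // optimizeForWrite == false: greedily pick members until every hosted bucket
    // (primary or secondary) is covered. The greedy choice is an assumption.
    static List<String> targetsForRead(Map<String, Set<Integer>> hosted) {
        Set<Integer> uncovered = new TreeSet<>();
        for (Set<Integer> buckets : hosted.values()) {
            uncovered.addAll(buckets);
        }
        List<String> targets = new ArrayList<>();
        while (!uncovered.isEmpty()) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, Set<Integer>> e : hosted.entrySet()) {
                Set<Integer> gain = new TreeSet<>(e.getValue());
                gain.retainAll(uncovered);
                if (gain.size() > bestGain) {
                    best = e.getKey();
                    bestGain = gain.size();
                }
            }
            targets.add(best);
            uncovered.removeAll(hosted.get(best));
        }
        return targets;
    }

    public static void main(String[] args) {
        // Two members, four buckets, one redundant copy: primaries are split,
        // but each member hosts a copy of every bucket.
        Map<String, Set<Integer>> primaries = new LinkedHashMap<>();
        primaries.put("server1", new TreeSet<>(Arrays.asList(0, 1)));
        primaries.put("server2", new TreeSet<>(Arrays.asList(2, 3)));
        Map<String, Set<Integer>> hosted = new LinkedHashMap<>();
        hosted.put("server1", new TreeSet<>(Arrays.asList(0, 1, 2, 3)));
        hosted.put("server2", new TreeSet<>(Arrays.asList(0, 1, 2, 3)));

        System.out.println(targetsForWrite(primaries)); // prints [server1, server2]
        System.out.println(targetsForRead(hosted));     // prints [server1]
    }
}
```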
>
> The onServer/onServers API is used for data-unaware calls (meaning no
> specific region involved). In the past, I've used it mainly for admin-type
> behavior like:
>
> - start/stop gateway senders
> - create regions
> - rebalance
> - assign buckets
>
> Now, gfsh does a lot of this behavior (maybe all of it), so I don't
> necessarily need functions to do it anymore.
>
> One of my favorite onServer use cases is the command pattern using a
> Request/Response API like:
>
> - define a Request (like RebalanceCache)
> - pass it as an argument to a CommandFunction from the client to a server
> using onServer
> - execute it on the server
> - return a Response
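
The steps above could look roughly like the following. Every name here (Request, Response, SimpleResponse, handle) is hypothetical and the Geode wiring is deliberately left out: on a real server the Request would arrive as the function's argument and the Response would be sent back through the function's result sender. Only RebalanceCache comes from the list above, and its body is a placeholder.

```java
import java.io.Serializable;

// Sketch of a Request/Response command pattern; all type names are hypothetical.
final class CommandSketch {
    private CommandSketch() {}

    interface Response extends Serializable {
        String message();
    }

    // A Request knows how to execute itself once it reaches the server.
    interface Request extends Serializable {
        Response execute();
    }

    static final class SimpleResponse implements Response {
        private final String message;

        SimpleResponse(String message) {
            this.message = message;
        }

        @Override
        public String message() {
            return message;
        }
    }

    static final class RebalanceCache implements Request {
        @Override
        public Response execute() {
            // Placeholder: a real implementation would start a rebalance here.
            return new SimpleResponse("rebalance requested");
        }
    }

    // Stand-in for the body of a CommandFunction: validate the argument,
    // run the command, and hand back its Response.
    static Response handle(Object argument) {
        if (!(argument instanceof Request)) {
            throw new IllegalArgumentException("expected a Request but got: " + argument);
        }
        return ((Request) argument).execute();
    }

    public static void main(String[] args) {
        System.out.println(handle(new RebalanceCache()).message()); // prints rebalance requested
    }
}
```

The appeal of this shape is that one deployed function can serve many commands, since adding a command only means adding a new Request class.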
>
> One use case for invoking a function from another function is member
> notification. This can be done with a CacheListener on a replicated region
> too, but the basic idea is:
>
> - invoke a function
> - in the function, invoke another function on all the members notifying
> them something is about to happen
> - do the thing
> - invoke another function on all the members notifying them something has
> happened
>
> You need to be careful when invoking one function from another. Depending
> on what you're doing in the second function, you could get yourself into a
> distributed deadlock situation.
>
> I'm not sure this answers all the issues you were seeing, but hopefully it
> helps.
>
> Thanks,
> Barry Oglesby
>
>
> On Fri, Apr 15, 2016 at 1:36 PM, Matt Ross <[email protected]> wrote:
>
>> Hi all,
>>
>> I'm involved in a sizable GemFire Project right now that is requiring me
>> to execute Functions in a number of ways, and I wanted to poll the
>> community for some best practices.  So initially I would execute all
>> functions like this.
>>
>> ResultCollector<?, ?> rc = FunctionService.onRegion(region)
>>     .withArgs(arguments).execute("my-awesome-function");
>>
>> And this worked reliably for quite some time, until I started mixing up 
>> functions that were executing on partition redundant data and replicated 
>> data.  I initially started having problems with this method when I had this 
>> setup.
>>
>> 1 locator, 2 servers, and executing functions that would run queries on
>> partition redundant and replicated regions.  I started getting this problem
>> where the function would execute on both servers, and the result collector
>> would indeterminately choose a server to return results from.  According to
>> logging statements placed within my function I was able to confirm that the
>> function was being executed twice, on both servers.  We were able to fix
>> this problem by switching from executing on region, to executing on Pool.
>> The initial logic being since there was replicated data on both servers, the
>> function would execute on both servers (hypothesis).
>>
>> Another issue was executing functions from within a function without a
>> function context.  Let's say I have one function that I execute with on
>> Pool, therefore it is passed a FunctionContext.  But when I'm actually in
>> the function I need to execute other functions, some needing a
>> RegionFunctionContext and some just needing a FunctionContext.  Initially I
>> was able to just use a ResultCollector and FunctionService.onRegion to get
>> a region context, and then pass my current function context to an instance
>> of a new function:
>>
>> MyAwesomeFunction myAwesomeFunction = new MyAwesomeFunction();
>>
>> myAwesomeFunction.execute(functionContext);
>>
>> This worked for a time but complexity started rising and more problems came 
>> up.
>>
>> So in short I wanted to throw out the blanket question of best practices on 
>> using (onRegion/onPool/onServer), calling other functions from within 
>> functions, what type of functions should be used on what type of regions, 
>> and general design patterns when executing functions.  Thanks!
>>
>> *Matthew Ross | Data Engineer | Pivotal*
>> *625 Avenue of the Americas NY, NY 10011*
>> *516-941-7535 <516-941-7535> | [email protected] <[email protected]> *
>>
>>
>
>
