Re: Best Practices for Calling Server Side Functions

Real Wes Williams Fri, 15 Apr 2016 17:14:55 -0700

Barry,

Would passing the RegionFunctionContext to the query execution apply whether 
the original function was executed on the Orders region vs the Trades region in 
your example?  If they are colocated I would intuitively think that it _may_ 
not matter, but if so, the side effects would probably be subtle.


To be specific by way of modifying your example, are the following equivalent 
given that Orders and Trades are colocated?

Example 1 - Executing on the Orders region:
**********************************************
Execution execution = 
FunctionService.onRegion(orderRegion).withFilter(Collections.singleton(cusip));
ResultCollector collector = execution.execute("TradeQueryFunction");

In the function...
Query query = queryService.newQuery("select * from /Trade where cusip = $1");
SelectResults results = (SelectResults) query.execute(rfc, new String[] 
{cusip});

Example 2 - Executing on the Trades region:
**********************************************
Execution execution = 
FunctionService.onRegion(tradeRegion).withFilter(Collections.singleton(cusip));
ResultCollector collector = execution.execute("TradeQueryFunction");

In the function...
Query query = queryService.newQuery("select * from /Trade where cusip = $1");
SelectResults results = (SelectResults) query.execute(rfc, new String[] 
{cusip});

> On Apr 15, 2016, at 7:53 PM, Barry Oglesby <[email protected]> wrote:
> 
> Executing queries in functions can be tricky.
> 
> For executing queries in a function, do something like:
> 
> - invoke the function with onRegion
> - have the function return true from optimizeForWrite so that it is executed 
> only on primary buckets
> - use the Query execute API with a RegionFunctionContext in the function. 
> Otherwise, you could easily end up executing the same query on more than one 
> member.
> 
> If you set a filter, the function (and query) will execute on only the member 
> containing the primary or primaries for that filter.
> 
> Here is an example with trades.
> 
> If you route all trades on a specific cusip to the same bucket using a 
> PartitionResolver, then querying for all trades for a specific cusip can be 
> done efficiently using a Function. The trades could be stored with a simple 
> String key like cusip-id or a complex key containing both the cusip and id. 
> Either way, the PartitionResolver will need to be able to return the cusip 
> for the routing object.
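
The routing idea in the paragraph above can be illustrated with a toy model in plain Java. This is only a sketch and not Geode's PartitionResolver API or its real hashing: the class name, the "cusip-id" key format, and the bucket count of 113 are assumptions made for illustration. It shows that when the routing object is the cusip alone, every trade for that cusip, and a bare cusip used as a filter, maps to the same bucket.

```java
// Toy model of key-to-bucket routing; this is not Geode's implementation.
final class CusipRoutingSketch {

    // Assumed key format from the text: "<cusip>-<id>". A bare cusip
    // (as used for a function filter) is returned unchanged.
    static String routingObject(String key) {
        int dash = key.indexOf('-');
        return dash < 0 ? key : key.substring(0, dash);
    }

    // Map the routing object (not the full key) onto a bucket id.
    static int bucketFor(String key, int totalBuckets) {
        return Math.floorMod(routingObject(key).hashCode(), totalBuckets);
    }

    public static void main(String[] args) {
        int totalBuckets = 113; // assumed bucket count for the example
        // Two trades with the same cusip land in the same bucket.
        System.out.println(bucketFor("123-1", totalBuckets) == bucketFor("123-2", totalBuckets)); // true
        // A filter consisting of just the cusip routes to that bucket too.
        System.out.println(bucketFor("123", totalBuckets) == bucketFor("123-1", totalBuckets)); // true
    }
}
```

In a real deployment the same extraction logic would live in the PartitionResolver configured on the region.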
> 
> Invoke the function like:
> 
> Execution execution = 
> FunctionService.onRegion(this.region).withFilter(Collections.singleton(cusip));
> ResultCollector collector = execution.execute("TradeQueryFunction");
> Object result = collector.getResult();
> 
> In the TradeQueryFunction, execute the query like:
> 
> RegionFunctionContext rfc = (RegionFunctionContext) context;
> String cusip = (String) rfc.getFilter().iterator().next();
> SelectResults results = (SelectResults) this.query.execute(rfc, new String[] 
> {cusip});
> 
> Where the query is:
> 
> select * from /Trade where cusip = $1
> 
> This will route the function request to the member whose primary bucket 
> contains the cusip filter. Then it will execute the query on the 
> RegionFunctionContext which will just be the data for that bucket. Note: the 
> PartitionResolver will also need to be able to return the cusip for that 
> filter (which is just the input string itself).
> 
> Here is some more general info on functions.
> 
> If you're executing a function onRegion with a replicated region, then the 
> function is executed on any member defining that region. Since the region is 
> replicated, every server has the same data.
> 
> If you're executing a function onRegion with a partitioned region, then where 
> the function is invoked depends on the result of optimizeForWrite. If 
> optimizeForWrite returns true, the function is invoked on all the members 
> containing primary buckets for that region. If optimizeForWrite returns 
> false, the function is invoked on as few members as it can that encompass all 
> the buckets (so it mixes primary and secondary buckets). For example if you 
> have 2 members, and the primaries are split between them, then 
> optimizeForWrite returning true means that the function will be invoked on 
> both members. Returning false will cause the function to be invoked on only 
> one member since each member has all the buckets. I almost always have 
> optimizeForWrite return true.
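
The two-member example above can be modelled in plain Java. This is a rough sketch under stated assumptions, not Geode's actual member-selection logic: the class and method names are made up, and the "as few members as possible" case is approximated with a simple greedy cover.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of which members a region function is sent to.
final class OptimizeForWriteSketch {

    // primaries: bucket id -> member hosting the primary copy
    // hosted:    member -> all buckets it hosts (primary or secondary)
    static Set<String> targets(Map<Integer, String> primaries,
                               Map<String, Set<Integer>> hosted,
                               boolean optimizeForWrite) {
        if (optimizeForWrite) {
            // Every member that hosts at least one primary bucket.
            return new TreeSet<>(primaries.values());
        }
        // Otherwise pick members greedily until all buckets are covered.
        Set<Integer> uncovered = new TreeSet<>(primaries.keySet());
        Set<String> chosen = new TreeSet<>();
        while (!uncovered.isEmpty()) {
            String best = null;
            int bestCount = 0;
            for (Map.Entry<String, Set<Integer>> e : hosted.entrySet()) {
                Set<Integer> covers = new HashSet<>(e.getValue());
                covers.retainAll(uncovered);
                if (covers.size() > bestCount) {
                    best = e.getKey();
                    bestCount = covers.size();
                }
            }
            if (best == null) {
                throw new IllegalStateException("some bucket is not hosted anywhere");
            }
            chosen.add(best);
            uncovered.removeAll(hosted.get(best));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Two members, primaries split between them, each hosting all buckets.
        Map<Integer, String> primaries = new TreeMap<>();
        Map<String, Set<Integer>> hosted = new TreeMap<>();
        hosted.put("A", new TreeSet<>());
        hosted.put("B", new TreeSet<>());
        for (int bucket = 0; bucket < 4; bucket++) {
            primaries.put(bucket, bucket % 2 == 0 ? "A" : "B");
            hosted.get("A").add(bucket);
            hosted.get("B").add(bucket);
        }
        System.out.println(targets(primaries, hosted, true));  // [A, B]
        System.out.println(targets(primaries, hosted, false)); // [A]
    }
}
```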
> 
> The onServer/onServers API is used for data-unaware calls (meaning no 
> specific region involved). In the past, I've used it mainly for admin-type 
> behavior like:
> 
> - start/stop gateway senders
> - create regions
> - rebalance
> - assign buckets
> 
> Now, gfsh does a lot of this behavior (maybe all of it), so I don't 
> necessarily need functions to do it anymore.
> 
> One of my favorite onServer use cases is the command pattern using a 
> Request/Response API like:
> 
> - define a Request (like RebalanceCache)
> - pass it as an argument to a CommandFunction from the client to a server 
> using onServer
> - execute it on the server 
> - return a Response
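
The steps above could look roughly like the following plain-Java sketch. The Request, Response, and handle names are hypothetical, and handle() only stands in for the body of a CommandFunction; in a real function the request would arrive as the function argument and the Response would be sent back through the result sender. RebalanceCache here does no actual rebalancing.

```java
import java.io.Serializable;

// Sketch of the command pattern; all names are illustrative only.
final class CommandPatternSketch {

    // A command the client sends to the server.
    interface Request extends Serializable {
        Response execute();
    }

    // What the server sends back.
    static final class Response implements Serializable {
        final boolean success;
        final String message;

        Response(boolean success, String message) {
            this.success = success;
            this.message = message;
        }
    }

    // Example request from the text; a real one would trigger a rebalance.
    static final class RebalanceCache implements Request {
        @Override
        public Response execute() {
            return new Response(true, "rebalance requested");
        }
    }

    // Stand-in for the CommandFunction body: validate the argument,
    // run the command, and return its Response.
    static Response handle(Object argument) {
        if (!(argument instanceof Request)) {
            return new Response(false, "unsupported argument: " + argument);
        }
        return ((Request) argument).execute();
    }

    public static void main(String[] args) {
        Response response = handle(new RebalanceCache());
        System.out.println(response.success + " " + response.message);
    }
}
```

The appeal of this design is that a single generic function can serve many admin operations, since adding a new operation only means adding a new Request class.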
> 
> One use case for invoking a function from another function is member 
> notification. This can be done with a CacheListener on a replicated region 
> too, but the basic idea is:
> 
> - invoke a function
> - in the function, invoke another function on all the members notifying them 
> something is about to happen
> - do the thing
> - invoke another function on all the members notifying them something has 
> happened
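
The notify, work, notify sequence above can be sketched in plain Java as follows. The Member interface and the event strings are assumptions for illustration; in a real system each onEvent call would be a second function invoked on the other members, which is where the ordering and blocking behavior needs care.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the member-notification sequence; names are illustrative only.
final class NotificationSketch {

    // Stand-in for invoking a notification function on one member.
    interface Member {
        void onEvent(String event);
    }

    // Notify everyone, do the work, then notify everyone again.
    static void run(List<Member> members, Runnable work) {
        for (Member member : members) {
            member.onEvent("before");
        }
        work.run();
        for (Member member : members) {
            member.onEvent("after");
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Member> members = new ArrayList<>();
        members.add(event -> log.add("A:" + event));
        members.add(event -> log.add("B:" + event));
        run(members, () -> log.add("work"));
        System.out.println(log);
    }
}
```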
> 
> You need to be careful when invoking one function from another. Depending on 
> what you're doing in the second function, you could get yourself into a 
> distributed deadlock situation.
> 
> I'm not sure this answers all the issues you were seeing, but hopefully it 
> helps.
> 
> Thanks,
> Barry Oglesby
> 
> 
> On Fri, Apr 15, 2016 at 1:36 PM, Matt Ross <[email protected]> wrote:
> Hi all,
> 
> I'm involved in a sizable GemFire Project right now that is requiring me to 
> execute Functions in a number of ways, and I wanted to poll the community for 
> some best practices.  So initially I would execute all functions like this. 
> 
> ResultCollector<?, ?> rc = FunctionService.onRegion(region)
>     .withArgs(arguments).execute("my-awesome-function");
> 
> And this worked reliably for quite some time, until I started mixing 
> functions that executed on partitioned redundant data with functions that 
> executed on replicated data.  I first ran into problems with this setup: 
> 1 locator, 2 servers, and functions that run queries on partitioned 
> redundant and replicated regions.  The function would execute on both 
> servers, and the result collector would nondeterministically choose a server 
> to return results from.  Logging statements placed within my function 
> confirmed that the function was being executed twice, on both servers.  We 
> were able to fix this problem by switching from executing on region to 
> executing on Pool.  Our hypothesis was that, since there was replicated data 
> on both servers, the function would execute on both servers.
> 
> Another issue was executing functions from within a function without a 
> function context.  Let's say I have one function that I execute with on Pool, 
> so it is passed a FunctionContext.  But when I'm actually in the function I 
> need to execute other functions, some needing a RegionFunctionContext and 
> some just needing a FunctionContext.  Initially I was able to just use a 
> ResultCollector and FunctionService.onRegion to get a region context, and 
> then pass my current function context to an instance of a new function:
> 
> MyAwesomeFunction myAwesomeFunction = new MyAwesomeFunction();
> myAwesomeFunction.execute(functionContext);
> 
> This worked for a time, but complexity started rising and more problems came 
> up.
> 
> So in short I wanted to throw out the blanket question of best practices on 
> using (onRegion/onPool/onServer), calling other functions from within 
> functions, what type of functions should be used on what type of regions, and 
> general design patterns when executing functions.  Thanks!
> Matthew Ross | Data Engineer | Pivotal
> 625 Avenue of the Americas NY, NY 10011
> 516-941-7535 | [email protected]
> 
>
