Thanks Barry. This is a needed thread.

Before we leave the topic of RegionFunctionContext: I often execute queries within a function invoked with onRegion. Is there a good reason why we don't support passing the RFC into a query when using bind parameters? If not, I'd like to add that as a Geode enhancement to eke out even more performance.

In performance tests using a region with only 2,000 entries on a small number of nodes, I did not see a performance difference between:

A) executing a query using the RFC, and
B) executing a query with bind parameters, not using the RFC,

although supporting the RFC with B should theoretically be even faster.

Regards,
Wes Williams

http://gemfire81.docs.pivotal.io/docs-gemfire/developing/query_additional/using_query_bind_parameters.html#concept_173E775FE46B47DF9D7D1E40680D34DF

> On Apr 15, 2016, at 9:07 PM, Barry Oglesby <[email protected]> wrote:
>
> Wes,
>
> Because your regions are colocated, your example actually works. I'm not sure
> why you'd do this, and I'm not sure I'd recommend it.
>
> Under the covers, the query determines it is on the Trade region. Then it
> gets the buckets (set of integers) from the RegionFunctionContext. Then, the
> query, parameters and buckets are passed to the region to be executed.
>
> So, the only thing the RFC is used for is to get the appropriate buckets, and
> they should be the same in either case.
>
> You might see some issues with this idea when buckets are moving around
> during a rebalance. You'd have to test in that scenario to verify.
>
> Thanks,
> Barry Oglesby
>
> On Fri, Apr 15, 2016 at 5:14 PM, Real Wes Williams <[email protected]> wrote:
> Barry,
>
> Would passing the RegionFunctionContext to the query execution apply whether
> the original function was executed on the Orders region vs the Trades region
> in your example? If they are colocated I would intuitively think that it
> _may_ not matter, but if so, the side effects would probably be subtle.
>
> To be specific by way of modifying your example, are the following equivalent
> given that Orders and Trades are colocated?
>
> Example 1 - Executing on the Orders region:
> **********************************************
> Execution execution =
>     FunctionService.onRegion(orderRegion).withFilter(Collections.singleton(cusip));
> ResultCollector collector = execution.execute("TradeQueryFunction");
>
> In the function….
> Query query = queryService.newQuery("select * from /Trade where cusip = $1");
> SelectResults results = (SelectResults) query.execute(rfc, new String[] {cusip});
>
> Example 2 - Executing on the Trades region:
> **********************************************
> Execution execution =
>     FunctionService.onRegion(tradeRegion).withFilter(Collections.singleton(cusip));
> ResultCollector collector = execution.execute("TradeQueryFunction");
>
> In the function….
> Query query = queryService.newQuery("select * from /Trade where cusip = $1");
> SelectResults results = (SelectResults) query.execute(rfc, new String[] {cusip});
>
> And then in the function:
>
>> On Apr 15, 2016, at 7:53 PM, Barry Oglesby <[email protected]> wrote:
>>
>> Executing queries in functions can be tricky.
>>
>> For executing queries in a function, do something like:
>>
>> - invoke the function with onRegion
>> - have the function return true from optimizeForWrite so that it is executed
>>   only on primary buckets
>> - use the Query execute API with a RegionFunctionContext in the function.
>>   Otherwise, you could easily end up executing the same query on more than one
>>   member.
>>
>> If you set a filter, the function (and query) will execute on only the
>> member containing the primary or primaries for that filter.
>>
>> Here is an example with trades.
>>
>> If you route all trades on a specific cusip to the same bucket using a
>> PartitionResolver, then querying for all trades for a specific cusip can be
>> done efficiently using a Function. The trades could be stored with a simple
>> String key like cusip-id or a complex key containing both the cusip and id.
>> Either way, the PartitionResolver will need to be able to return the cusip
>> for the routing object.
>>
>> Invoke the function like:
>>
>> Execution execution =
>>     FunctionService.onRegion(this.region).withFilter(Collections.singleton(cusip));
>> ResultCollector collector = execution.execute("TradeQueryFunction");
>> Object result = collector.getResult();
>>
>> In the TradeQueryFunction, execute the query like:
>>
>> RegionFunctionContext rfc = (RegionFunctionContext) context;
>> String cusip = (String) rfc.getFilter().iterator().next();
>> SelectResults results = (SelectResults) this.query.execute(rfc, new String[] {cusip});
>>
>> Where the query is:
>>
>> select * from /Trade where cusip = $1
>>
>> This will route the function request to the member whose primary bucket
>> contains the cusip filter. Then it will execute the query on the
>> RegionFunctionContext which will just be the data for that bucket. Note: the
>> PartitionResolver will also need to be able to return the cusip for that
>> filter (which is just the input string itself).
>>
>> Here is some more general info on functions.
>>
>> If you're executing a function onRegion with a replicated region, then the
>> function is executed on any member defining that region. Since the region is
>> replicated, every server has the same data.
>>
>> If you're executing a function onRegion with a partitioned region, then
>> where the function is invoked depends on the result of optimizeForWrite. If
>> optimizeForWrite returns true, the function is invoked on all the members
>> containing primary buckets for that region. If optimizeForWrite returns
>> false, the function is invoked on as few members as it can that encompass
>> all the buckets (so it mixes primary and secondary buckets). For example if
>> you have 2 members, and the primaries are split between them, then
>> optimizeForWrite returning true means that the function will be invoked on
>> both members. Returning false will cause the function to be invoked on only
>> one member since each member has all the buckets. I almost always have
>> optimizeForWrite return true.
>>
>> The onServer/onServers API is used for data-unaware calls (meaning no
>> specific region involved). In the past, I've used it mainly for admin-type
>> behavior like:
>>
>> - start/stop gateway senders
>> - create regions
>> - rebalance
>> - assign buckets
>>
>> Now, gfsh does a lot of this behavior (maybe all of it), so I don't
>> necessarily need functions to do it anymore.
>>
>> One of my favorite onServer use cases is the command pattern using a
>> Request/Response API like:
>>
>> - define a Request (like RebalanceCache)
>> - pass it as an argument to a CommandFunction from the client to a server
>>   using onServer
>> - execute it on the server
>> - return a Response
>>
>> One use case for invoking a function from another function is member
>> notification. This can be done with a CacheListener on a replicated region
>> too, but the basic idea is:
>>
>> - invoke a function
>> - in the function, invoke another function on all the members notifying them
>>   something is about to happen
>> - do the thing
>> - invoke another function on all the members notifying them something has
>>   happened
>>
>> You need to be careful when invoking one function from another. Depending on
>> what you're doing in the second function, you could get yourself into a
>> distributed deadlock situation.
>>
>> I'm not sure this answers all the issues you were seeing, but hopefully it
>> helps.
>>
>> Thanks,
>> Barry Oglesby
>>
>> On Fri, Apr 15, 2016 at 1:36 PM, Matt Ross <[email protected]> wrote:
>> Hi all,
>>
>> I'm involved in a sizable GemFire Project right now that is requiring me to
>> execute Functions in a number of ways, and I wanted to poll the community
>> for some best practices. So initially I would execute all functions like
>> this.
>>
>> ResultCollector<?, ?> rc = FunctionService.onRegion(region)
>>     .withArgs(arguments).execute("my-awesome-function");
>>
>> And this worked reliably for quite some time, until I started mixing up
>> functions that were executing on partition redundant data and replicated
>> data. I initially started having problems with this method when I had this
>> setup:
>> 1 locator, 2 servers, and executing functions that would run queries on
>> partition redundant and replicated regions. I started getting this problem
>> where the function would execute on both servers, and the result collector
>> would indeterminately choose a server to return results from. According to
>> logging statements placed within my function I was able to confirm that the
>> function was being executed twice, on both servers. We were able to fix
>> this problem by switching from executing on region, to executing on Pool.
>> The initial logic being since there was replicated data on both servers, the
>> function would execute on both servers (hypothesis).
>>
>> Another issue was executing functions from within a function without a
>> function context. Let's say I have one function that I execute with on
>> Pool, therefore it is passed a Function Context. But when I'm actually in
>> the function I need to execute other functions, some needing a
>> RegionFunctionContext and some just needing a FunctionContext. Initially I
>> was able to just use a Result Collector and FunctionService.onRegion to get
>> a region context, and then pass my current function context to an instance
>> of a new function:
>>
>> MyAwesomeFunction myAwesomeFunction = new MyAwesomeFunction();
>> myAwesomeFunction.execute(functionContext);
>>
>> This worked for a time but complexity started rising and more problems came
>> up.
>>
>> So in short I wanted to throw out the blanket question of best practices on
>> using (onRegion/onPool/onServer), calling other functions from within
>> functions, what type of functions should be used on what type of regions,
>> and general design patterns when executing functions. Thanks!
>>
>> Matthew Ross | Data Engineer | Pivotal
>> 625 Avenue of the Americas NY, NY 10011
>> 516-941-7535 | [email protected]
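
A toy model of the optimizeForWrite behavior Barry describes earlier in the thread (two members, primaries split between them, each member holding all the buckets) may help make it concrete. It is plain Java with no Geode dependency. The Member class, the targetMembers method and the greedy cover used for the false case are illustrative assumptions, not Geode's actual member-selection code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: Member and targetMembers are hypothetical names, not Geode API.
final class OptimizeForWriteSketch {

    // One cache member and the bucket ids it hosts as primary or as redundant copy.
    static final class Member {
        final String name;
        final Set<Integer> primaries;
        final Set<Integer> secondaries;

        Member(String name, List<Integer> primaries, List<Integer> secondaries) {
            this.name = name;
            this.primaries = new TreeSet<>(primaries);
            this.secondaries = new TreeSet<>(secondaries);
        }

        // Every bucket this member can serve, primary or secondary.
        Set<Integer> allBuckets() {
            Set<Integer> all = new TreeSet<>(primaries);
            all.addAll(secondaries);
            return all;
        }
    }

    // Returns the names of the members the function would be invoked on.
    static List<String> targetMembers(List<Member> members, boolean optimizeForWrite) {
        List<String> targets = new ArrayList<>();
        if (optimizeForWrite) {
            // true: every member that holds at least one primary bucket.
            for (Member m : members) {
                if (!m.primaries.isEmpty()) {
                    targets.add(m.name);
                }
            }
            return targets;
        }
        // false: few members that together cover all buckets, mixing primaries
        // and secondaries. The greedy cover is an assumption of this sketch.
        Set<Integer> uncovered = new TreeSet<>();
        for (Member m : members) {
            uncovered.addAll(m.allBuckets());
        }
        while (!uncovered.isEmpty()) {
            Member best = null;
            int bestGain = 0;
            for (Member m : members) {
                Set<Integer> gain = m.allBuckets();
                gain.retainAll(uncovered);
                if (gain.size() > bestGain) {
                    best = m;
                    bestGain = gain.size();
                }
            }
            targets.add(best.name);
            uncovered.removeAll(best.allBuckets());
        }
        return targets;
    }

    // The two-member layout from the thread: primaries split, full redundancy.
    static List<Member> twoMemberExample() {
        return Arrays.asList(
            new Member("server1", Arrays.asList(0, 1), Arrays.asList(2, 3)),
            new Member("server2", Arrays.asList(2, 3), Arrays.asList(0, 1)));
    }

    public static void main(String[] args) {
        List<Member> members = twoMemberExample();
        System.out.println("optimizeForWrite=true  -> " + targetMembers(members, true));
        System.out.println("optimizeForWrite=false -> " + targetMembers(members, false));
    }
}
```

With this layout the true case lists both servers and the false case lists only server1, which matches the two-member example in Barry's reply.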
