Hey Kris,

Well I can try and relay the "vision".  The ideal "vision" would have been
to have the dispatch be able to communicate with the framework so that
$serviceUrl[SOLRCLOUD] in your rewrite rules would do the right thing.  We
certainly aren't there yet.  So the next best thing from my perspective
would be to implement a custom solrcloud rewrite function.  Now some
pointers that will hopefully show where we are and lay out a pattern.  Keep
in mind that these would all be implemented in your
gateway-service-solrcloud module.

gateway-provider-rewrite-func-service-registry/src/main/java/org/apache/hadoop/gateway/svcregfunc/api/ServiceUrlFunctionDescriptor.java
Here you will see how to "declare" a rewrite function.
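
As a rough sketch of what such a declaration could look like for Solr
Cloud: the FunctionDescriptor interface below is only a stand-in so the
example compiles on its own (the real one is Knox's
UrlRewriteFunctionDescriptor; check the file above for its exact shape),
and the class name and the "solrCloudUrl" function name are made up for
illustration.

```java
// Stand-in for Knox's UrlRewriteFunctionDescriptor so that this sketch
// compiles without Knox on the classpath. This is an assumption, not the
// real interface.
interface FunctionDescriptor<T> {
  String name();
}

// Hypothetical descriptor for a Solr Cloud rewrite function. The name it
// returns is what a rewrite rule would use after the '$'.
class SolrCloudUrlFunctionDescriptor
    implements FunctionDescriptor<SolrCloudUrlFunctionDescriptor> {

  // Hypothetical function name, chosen only for this example.
  static final String FUNCTION_NAME = "solrCloudUrl";

  @Override
  public String name() {
    return FUNCTION_NAME;
  }
}
```

With a function named like this, the rewrite templates would presumably
use {$solrCloudUrl[SOLRCLOUD]} in place of {$serviceUrl[SOLRCLOUD]}.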

gateway-provider-rewrite-func-service-registry/src/main/java/org/apache/hadoop/gateway/svcregfunc/impl/ServiceUrlFunctionProcessor.java
Here you will see the implementation of the current "serviceUrl" rewrite
function.  The most important part is the call to lookupServiceUrl.  If
you dig through enough you will see how this is hooked up to the current
HA stuff.  In particular note how WebHdfsHaDispatch uses HaProvider to
ultimately interact with ServiceRegistryFunctionProcessorBase.  You may
actually be able to use HaProvider.


gateway-provider-rewrite-func-service-registry/src/main/java/org/apache/hadoop/gateway/svcregfunc/impl/ServiceRegistryFunctionProcessorBase.java
This is where lookupServiceUrl is implemented.  Note the environment that
is provided to the initialize method.  The environment.getAttribute call
basically boils down to a ServletContext.getAttribute call so you can use
this to share state with your dispatch impl.
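
To make the state-sharing idea concrete, here is a minimal sketch.  None
of these types are Knox classes: Environment and MapEnvironment stand in
for the rewrite environment / ServletContext, and ActiveSolrUrl, its
attribute name and SolrCloudUrlFunctionProcessor are hypothetical names.
The real processor interface has a different initialize/resolve signature,
so treat this as the pattern only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Stand-in for the rewrite environment (assumption, not the Knox type).
interface Environment {
  <T> T getAttribute(String name);
}

// Map-backed environment playing the role of the ServletContext attributes.
class MapEnvironment implements Environment {
  private final Map<String, Object> attributes = new HashMap<>();

  void setAttribute(String name, Object value) {
    attributes.put(name, value);
  }

  @SuppressWarnings("unchecked")
  @Override
  public <T> T getAttribute(String name) {
    return (T) attributes.get(name);
  }
}

// Hypothetical shared holder. The dispatch would update it after asking
// Zookeeper for a live Solr host.
class ActiveSolrUrl {
  static final String ATTRIBUTE_NAME = "solrcloud.active.url"; // made up

  private final AtomicReference<String> url = new AtomicReference<>();

  void set(String value) { url.set(value); }
  String get() { return url.get(); }
}

// Hypothetical rewrite function processor that reads the shared holder.
class SolrCloudUrlFunctionProcessor {
  private ActiveSolrUrl active;

  // Pick up the shared holder from the environment once, at startup.
  void initialize(Environment environment) {
    active = environment.getAttribute(ActiveSolrUrl.ATTRIBUTE_NAME);
  }

  // Return the currently active Solr URL, or nothing if none is known yet.
  List<String> resolve(List<String> parameters) {
    List<String> results = new ArrayList<>();
    String current = (active == null) ? null : active.get();
    if (current != null) {
      results.add(current);
    }
    return results;
  }
}
```

The dispatch side would look up the same attribute and call set(...)
whenever it picks a new active Solr host, so the next rewrite sees it.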

gateway-provider-rewrite-func-service-registry/src/main/resources/META-INF/services/org.apache.hadoop.gateway.filter.rewrite.api.UrlRewriteFunctionDescriptor
gateway-provider-rewrite-func-service-registry/src/main/resources/META-INF/services/org.apache.hadoop.gateway.filter.rewrite.spi.UrlRewriteFunctionProcessor
Like everything else in Knox we use service loaders to find things, so
your module will need files like these for the rewrite system to find your
rewrite function.
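
For illustration, the two files in your module might look roughly like
this.  Each file is named after the interface and lists one implementing
class per line.  The org.apache.hadoop.gateway.solrcloud package and both
class names are hypothetical; substitute whatever you call yours.

```
# src/main/resources/META-INF/services/org.apache.hadoop.gateway.filter.rewrite.api.UrlRewriteFunctionDescriptor
org.apache.hadoop.gateway.solrcloud.SolrCloudUrlFunctionDescriptor

# src/main/resources/META-INF/services/org.apache.hadoop.gateway.filter.rewrite.spi.UrlRewriteFunctionProcessor
org.apache.hadoop.gateway.solrcloud.SolrCloudUrlFunctionProcessor
```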

Now... The outstanding question for me is the list of Zookeeper URLs.  We
may want to treat them like we do the NAMENODE and JOBTRACKER services
today which are in the topology (and therefore in the service registry)
but not really exposed.

Kevin.

On 5/28/15, 9:34 AM, "Kristopher Kane" <[email protected]> wrote:

>All,
>
>I'm implementing a Solr Cloud proxy in Knox and need help understanding
>how to implement topology roles that use a late binding 'discovery'
>service to derive the destination host.
>
>The implementation uses gateway-service-definitions and a custom dispatch
>in a new module called gateway-service-solrcloud.
>
>Here are the two rewrite rules as a reference - which are just copies of
>WebHCat's:
>
><rules>
>
><rule dir="IN" name="SOLRCLOUD/solrcloud/root/inbound"
>pattern="*://*:*/**/solr/?{**}">
><rewrite template="{$serviceUrl[SOLRCLOUD]}/?{**}"/>
></rule>
>
><rule dir="IN" name="SOLRCLOUD/solrcloud/path/inbound"
>pattern="*://*:*/**/solr/{path=**}?{**}">
><rewrite template="{$serviceUrl[SOLRCLOUD]}/{path=**}?{**}"/>
></rule>
>
></rules>
>
>Right now my topology file includes a single Solr server hostname for the
>SOLRCLOUD role that acts as a dummy place holder.  The custom dispatch
>queries Zookeeper for Solr hosts, rewrites the outboundRequest and sends
>it off to an active Solr host.  The ZK host is hard coded in the dispatch
>at this point and it is all working fine.
>
>The next step is to put a comma separated list of zk_hostname:port in the
>topology file for the SOLRCLOUD role, in place of any known Solr hosts.
>
>I'm not sure what my rewrite looks like with my intentions and really
>assume that I will need a new provider that understands that it isn't
>simply rewriting a URI, but rather triggering the query to ZK. Perhaps I
>can inject that process first and then tell Knox to rewrite with the URI I
>get back from ZK. From the WebHDFS HA dispatch code, I learned that you
>can trigger the rewrite rules to run again with something like:
>
>//null out target url so that rewriters run again
>inboundRequest.setAttribute(AbstractGatewayFilter.TARGET_REQUEST_URL_ATTRIBUTE_NAME, null);
>
>It looks like I have access to everything I need in
>DefaultDispatch.executeRequest(...) to make the complete query myself:
>
>//Knox path info to omit
>2015-05-28 01:52:52,180 DEBUG hadoop.gateway (SolrCloudDispatch.java:executeRequest(51)) - inboundRequest Attribute getContextPath() is /gateway/sandbox
>//starts the important service info here
>2015-05-28 01:52:52,180 DEBUG hadoop.gateway (SolrCloudDispatch.java:executeRequest(52)) - inboundRequest Attribute getPathInfo() is /solr/gettingstarted/select
>
>If a query exists then append the query: '?'+getQueryString()
>
>
>So...my questions are: How do I put my Zookeeper server list in the
>topology file and tell the whole process to forget about rewriting and let
>me do it in the dispatch?  Or is that the wrong approach?
>
>
>Thanks,
>
>Kris
