All,

I'm implementing a Solr Cloud proxy in Knox and need help understanding how
to implement topology roles that use a late-binding 'discovery' service to
derive the destination host.

The implementation uses gateway-service-definitions and a custom dispatch
in a new module called gateway-service-solrcloud.

Here are the two rewrite rules for reference; they are just copies of
WebHCat's:

<rules>

  <rule dir="IN" name="SOLRCLOUD/solrcloud/root/inbound" pattern="*://*:*/**/solr/?{**}">
    <rewrite template="{$serviceUrl[SOLRCLOUD]}/?{**}"/>
  </rule>

  <rule dir="IN" name="SOLRCLOUD/solrcloud/path/inbound" pattern="*://*:*/**/solr/{path=**}?{**}">
    <rewrite template="{$serviceUrl[SOLRCLOUD]}/{path=**}?{**}"/>
  </rule>

</rules>

Right now my topology file includes a single Solr server hostname for the
SOLRCLOUD role, which acts as a dummy placeholder. The custom dispatch
queries ZooKeeper for Solr hosts, rewrites the outboundRequest and sends it
off to an active Solr host. The ZK host is hard-coded in the dispatch at
this point, and it is all working fine.

The next step is to put a comma-separated list of zk_hostname:port entries
in the topology file for the SOLRCLOUD role, in place of any known Solr hosts.
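
As a rough sketch of what the dispatch would do with such a list once it has
it: the helper below is hypothetical (ZkEnsembleParser is not part of Knox),
it assumes the list reaches the dispatch as a plain string, and it assumes
2181 as the port when an entry omits one.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Knox: parses a comma-separated
// "zk_hostname:port" list such as "zk1:2181,zk2:2181".
class ZkEnsembleParser {

    // Assumption: fall back to ZooKeeper's conventional client port.
    static final int DEFAULT_ZK_PORT = 2181;

    static List<InetSocketAddress> parse(String ensemble) {
        List<InetSocketAddress> hosts = new ArrayList<>();
        if (ensemble == null) {
            return hosts;
        }
        for (String entry : ensemble.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_ZK_PORT
                    : Integer.parseInt(trimmed.substring(colon + 1));
            // Unresolved: no DNS lookup happens while parsing configuration.
            hosts.add(InetSocketAddress.createUnresolved(host, port));
        }
        return hosts;
    }

    public static void main(String[] args) {
        String list = "zk1.example.com:2181, zk2.example.com:2181";
        for (InetSocketAddress a : parse(list)) {
            System.out.println(a.getHostString() + " port " + a.getPort());
        }
    }
}
```

How the string actually gets from the topology file into the dispatch is the
open question; the sketch only covers the parsing step after that.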

I'm not sure what my rewrite rules should look like in that case. I assume
I will need a new provider that understands that it isn't simply rewriting
a URI, but rather triggering the query to ZK. Perhaps I can inject that
step first and then tell Knox to rewrite with the URI I get back from ZK.
From the WebHDFS HA dispatch code, I learned that you can trigger the
rewrite rules to run again with something like:

// null out target url so that rewriters run again
inboundRequest.setAttribute(AbstractGatewayFilter.TARGET_REQUEST_URL_ATTRIBUTE_NAME, null);

It looks like I have access to everything I need in
DefaultDispatch.executeRequest(...) to make the complete query myself:

// Knox path info to omit
2015-05-28 01:52:52,180 DEBUG hadoop.gateway (SolrCloudDispatch.java:executeRequest(51)) - inboundRequest Attribute getContextPath() is /gateway/sandbox
// starts the important service info here
2015-05-28 01:52:52,180 DEBUG hadoop.gateway (SolrCloudDispatch.java:executeRequest(52)) - inboundRequest Attribute getPathInfo() is /solr/gettingstarted/select

If a query string exists, append it: '?' + getQueryString()
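
To make that concrete, here is a minimal sketch of the URL assembly. It
assumes the ZK lookup has already produced a value of the form
scheme://host:port; SolrTargetUrl and its parameters are hypothetical names,
and the strings stand in for what getPathInfo() and getQueryString() return.

```java
// Hypothetical helper, not part of Knox: joins a Solr host obtained from ZK
// with the servlet path info and the optional query string.
class SolrTargetUrl {

    static String build(String solrHost, String pathInfo, String queryString) {
        StringBuilder url = new StringBuilder(solrHost);
        // Avoid a doubled slash if the host value ends with one.
        if (url.length() > 0 && url.charAt(url.length() - 1) == '/') {
            url.setLength(url.length() - 1);
        }
        if (pathInfo != null) {
            url.append(pathInfo); // e.g. /solr/gettingstarted/select
        }
        if (queryString != null && !queryString.isEmpty()) {
            url.append('?').append(queryString);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Prints:
        // http://solr1.example.com:8983/solr/gettingstarted/select?q=*:*&wt=json
        System.out.println(build("http://solr1.example.com:8983",
                "/solr/gettingstarted/select", "q=*:*&wt=json"));
    }
}
```

The context path (/gateway/sandbox) is never passed in, so it is dropped
without any extra handling.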


So, my questions are: How do I put my ZooKeeper server list in the
topology file and tell the whole process to skip the rewriting and let
me do it in the dispatch? Or is that the wrong approach?


Thanks,

Kris
