On 1 Jun 2015, at 13:47, Jean-Baptiste Note <jbn...@gmail.com> wrote:
> Hi there,
>
> I've successfully exported some dynamic host/port combinations in Slider for Kafka on YARN; they are made available under publisher/exports/servers on the appmaster (see https://github.com/jbnote/koya/).
>
> I'm now trying to access this information (really, service location) in two different ways:
>
> * From within Slider. Is there a public API that I could use directly in Python from other Slider instances to get to this information? This is necessary for spawning Kafka mirroring from Slider, for instance. From what I can see in storm-slider, the slider binary is directly invoked.

The code to look up entries is in the hadoop-yarn-registry API, shipping in Hadoop 2.6.

> * From the rest of the world. I was thinking of exporting the data to DNS, and hoped to do this with a ZooKeeper-monitoring daemon, which is already partially implemented. However, none of my exported data seems to be present in ZK, which I was naively hoping for. Is there something I'm missing? I find the ZK way perfect, rather than the REST API, which as far as I can see will require polling. In Python, monitoring ZK is a breeze.
>
> Can someone familiar with the design intent shed some light on how I should carry this out?

YARN-913 is the registry design; it's documented in http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/registry/index.html

1. Everything is (publicly) published to ZK.
2. There's an API ( http://hadoop.apache.org/docs/current/api/index.html ) in Java.
3. Slider has a .py client too.

Slider deliberately doesn't publish the full set of documents to the registry; too much data and too high a rate of change are what hit ZK scalability and performance.

Instead we have a Slider-specific API for publishing sets of configurations, each configuration being served up as JSON, XML, or properties. Look at org.apache.slider.server.appmaster.web.rest.publisher.PublisherResource for the specifics, but it essentially comes down to:

list the configuration sets (JSON):
    GET ws/v1/exports/

list the configuration files of a configuration set:
    GET ws/v1/exports/${configset}

retrieve a config:
    ws/v1/exports/${configset}/{configuration}.${suffix}
    suffix = [xml|json|properties]

finally, get a specific property:
    ws/v1/exports/${configset}/{configuration}/${property}

Regarding Python monitoring, our code is in the slider-agent module. Bear in mind that ZK listening isn't that resilient to failures of ZK nodes. Our agent only checks ZK at startup, and then starts polling after the AM fails.

The Hive LLAP team are using the YARN registry now, and want to add a TTL field to each entry; this would let the client know when to recheck.
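
For what it's worth, the publisher endpoints listed above could be polled from Python roughly as follows. This is only a sketch under assumptions: the helper names (exports_url, fetch_text, poll) are made up for illustration, the AM base URL has to be discovered separately, and the path prefix is copied from the list above rather than checked against a running AM. It uses only the Python 3 standard library.

```python
import time
from urllib.parse import quote
from urllib.request import urlopen

# Path prefix as quoted in this thread; not verified against a live AM.
EXPORTS_ROOT = "ws/v1/exports"
SUFFIXES = ("xml", "json", "properties")


def exports_url(am_base, configset=None, configuration=None,
                suffix=None, prop=None):
    """Build one of the four publisher URLs described in the thread.

    am_base is the AM's web address (hypothetical here; look it up yourself).
    """
    root = am_base.rstrip("/") + "/" + EXPORTS_ROOT
    if configset is None:
        return root + "/"                      # list configuration sets
    url = root + "/" + quote(configset, safe="")
    if configuration is None:
        return url                             # list configs in a set
    url += "/" + quote(configuration, safe="")
    if prop is not None:
        return url + "/" + quote(prop, safe="")  # single property
    if suffix not in SUFFIXES:
        raise ValueError("suffix must be one of %s" % (SUFFIXES,))
    return url + "." + suffix                  # whole config in one format


def fetch_text(url, timeout=10):
    """GET a URL and return the body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def poll(url, on_change, interval=30.0, max_polls=None, fetch=fetch_text):
    """Fetch url repeatedly; call on_change(body) whenever the body changes.

    Network errors are swallowed so the last known value survives an AM
    restart. Returns the last body seen (None if every fetch failed).
    """
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            body = fetch(url)
        except OSError:        # URLError and socket timeouts are OSErrors
            body = None
        if body is not None and body != last:
            last = body
            on_change(body)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
    return last
```

Something like poll(exports_url("http://am-host:1025", "servers", "brokers", suffix="json"), print) would then print the exported document each time it changes; the host, port and the "brokers" configuration name are placeholders, not values from koya. The fixed interval is the obvious weak point, which is where a TTL on each entry would help.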