It would make sense to use NIO rather than threaded I/O.
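
To make that concrete, here is a minimal sketch in plain Java NIO of one thread multiplexing several sources through a `Selector`, instead of dedicating one blocking thread to each. None of this is taken from the prototype: the class name, the `drain` method and the topic/group labels are made up for illustration, and `Pipe` channels are only stand-ins for real connections. How the consumer streams would plug into such a loop is left open.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class SelectorSketch {

    // Hypothetical helper: reads from all sources on the calling thread until
    // `expected` reads have produced data or every source has reached EOF.
    // Each label is an illustrative stand-in for a topic+group name.
    static List<String> drain(List<Pipe.SourceChannel> sources,
                              List<String> labels,
                              int expected) throws IOException {
        List<String> received = new ArrayList<>();
        int open = sources.size();
        try (Selector selector = Selector.open()) {
            for (int i = 0; i < sources.size(); i++) {
                Pipe.SourceChannel source = sources.get(i);
                source.configureBlocking(false); // must be non-blocking to register
                source.register(selector, SelectionKey.OP_READ, labels.get(i));
            }
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            while (received.size() < expected && open > 0) {
                selector.select(); // one thread waits on all sources at once
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    buffer.clear();
                    int n = ((Pipe.SourceChannel) key.channel()).read(buffer);
                    if (n > 0) {
                        buffer.flip();
                        received.add(key.attachment() + ": "
                                + StandardCharsets.UTF_8.decode(buffer));
                    } else if (n < 0) {
                        key.channel().close(); // EOF: closing also cancels the key
                        open--;
                    }
                }
            }
        }
        return received;
    }

    public static void main(String[] args) throws IOException {
        Pipe a = Pipe.open();
        Pipe b = Pipe.open();
        a.sink().write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
        b.sink().write(ByteBuffer.wrap("world".getBytes(StandardCharsets.UTF_8)));
        List<String> out = drain(List.of(a.source(), b.source()),
                List.of("topic-a/group-1", "topic-b/group-1"), 2);
        out.forEach(System.out::println);
    }
}
```

Running `main` prints one line per source, in whatever order the selector reports them as readable.
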

On Nov 20, 2012, at 2:06 PM, David Arthur <mum...@gmail.com> wrote:

> BTW, here are some cURL calls from my test environment:
>
> https://gist.github.com/e59b9c8ee4ae56dad44f
>
> On Nov 20, 2012, at 4:08 PM, David Arthur wrote:
>
>> Another bump for this thread...
>>
>> For those just joining, this prototype is a simple HTTP server that proxies
>> the complex consumer code through two HTTP endpoints.
>>
>> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
>>
>> E.g.,
>>
>> curl http://localhost:8888/my-topic -X POST -d 'Here is a message'
>>
>> and
>>
>> curl http://localhost:8888/my-topic/my-group -X GET
>>
>> This is not an attempt to expose the FetchRequest/ProduceRequest protocol
>> over HTTP.
>>
>> Few questions:
>>
>> * Would including offsets be useful here? Since it is utilizing the
>>   ZK-backed consumer code, I would think not
>> * I have chosen to create one thread per topic+group (mostly for simplicity
>>   sake). Multiple REST servers could be run and load balanced across to
>>   increase the consumer parallelism. Maybe it would make sense for an
>>   individual REST server to create more than one thread per topic+group?
>>
>> Cheers
>> -David
>>
>> On Sep 10, 2012, at 9:49 AM, David Arthur wrote:
>>
>>> Bump.
>>>
>>> Anyone have feedback on this approach?
>>>
>>> -David
>>>
>>> On Aug 24, 2012, at 12:37 PM, David Arthur wrote:
>>>
>>>> Here is an initial pass at a Kafka REST proxy (in Scala)
>>>>
>>>> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
>>>>
>>>> The basic gist is:
>>>> * Jetty for webserver
>>>> * Messages are strings
>>>> * GET /topic/group to get a message (timeout after 1s)
>>>> * POST /topic, the request body is the message
>>>> * One consumer thread per topic+group
>>>>
>>>> Be wary, many things are hard coded at this point (port numbers, etc).
>>>> Obviously, this will need to change.
>>>> Also, I haven't the slightest idea
>>>> how to setup/use sbt properly, so I just checked in the libs.
>>>>
>>>> Feedback is welcome in this thread or on Github. Be gentle please, this
>>>> is my first go at Scala
>>>>
>>>> -David
>>>>
>>>> On Aug 12, 2012, at 10:39 AM, Taylor Gautier wrote:
>>>>
>>>>> Jay I agree with you 100%.
>>>>>
>>>>> At Tagged we have implemented a proxy for various internal reasons (
>>>>> primarily to act as a high performance relay from PHP to Kafka). It's
>>>>> implemented in Node.js (JavaScript)
>>>>>
>>>>> Currently it services UDP packets encoded in binary but it could
>>>>> easily be modified to accept http also since Node support for http is
>>>>> pretty simple.
>>>>>
>>>>> If others are interested in maintaining something like this we could
>>>>> consider adding this to the public domain along side the already
>>>>> existing Node.js client implementation.
>>>>>
>>>>> On Aug 10, 2012, at 3:51 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>>
>>>>>> My personal preference would be to have only a single protocol in kafka
>>>>>> core. I have been down the multiple protocol route and my experience was
>>>>>> that it adds a lot of burden for each change that needs to be made and a
>>>>>> lot of complexity to abstract over the different protocols. From the point
>>>>>> of view of a user they are generally a bit agnostic as to how bytes are
>>>>>> sent back and forth provided it is reliable and easily implementable in any
>>>>>> language. Generally they care more about the quality of the client in their
>>>>>> language of choice.
>>>>>>
>>>>>> My belief is that the main benefit of REST is ease of implementing a
>>>>>> client. But currently the biggest barrier is really the use of zk and
>>>>>> fairly thick consumer design. So I think the current thinking is that we
>>>>>> should focus on thinning that out and removing the client-side zk
>>>>>> dependency.
>>>>>> I actually don't think TCP is a huge burden if the protocol is
>>>>>> simple, and there are actually some advantages (for example the consumer
>>>>>> needs to consume from multiple servers so select/poll/epoll is natural but
>>>>>> this is not always available from HTTP client libraries).
>>>>>>
>>>>>> Basically this is an area where I think it is best to pick one way and
>>>>>> really make it really bullet proof rather than providing lots of options.
>>>>>> In some sense each option tends to increase the complexity of testing
>>>>>> (since now there are many combinations to try) and also of implementation
>>>>>> (since now a lot things that were concrete now need to be abstracted
>>>>>> away).
>>>>>>
>>>>>> So from this perspective I would prefer a standalone proxy that could
>>>>>> evolve independently rather than retro-fitting the current socket server to
>>>>>> handle other protocols. There will be some overhead for the extra hop, but
>>>>>> then there is some overhead for HTTP itself.
>>>>>>
>>>>>> This is just my personal opinion, it would be great to hear what other
>>>>>> think.
>>>>>>
>>>>>> -Jay
>>>>>>
>>>>>> On Mon, Aug 6, 2012 at 5:39 AM, David Arthur <mum...@gmail.com> wrote:
>>>>>>
>>>>>>> I'd be happy to collaborate on this, though it's been a while since I've
>>>>>>> used PHP.
>>>>>>>
>>>>>>> From what it looks like, what you have is a true proxy that runs outside
>>>>>>> of Kafka and translates some REST routes into Kafka client calls. This
>>>>>>> sounds more in line with what the project page describes. What I have
>>>>>>> proposed is more like a translation layer between some REST routes and
>>>>>>> FetchRequests. In this case the client is responsible for managing offsets.
>>>>>>> Using the consumer groups and ZooKeeper would be another nice way of
>>>>>>> consuming messages (which is probably more like what you have).
>>>>>>>
>>>>>>> Any maintainers have feedback on this?
>>>>>>>
>>>>>>> On Aug 3, 2012, at 4:13 PM, Jonathan Creasy wrote:
>>>>>>>
>>>>>>>> I have an internal one working and was hoping to have it open sourced in
>>>>>>>> the next week. The one at Box is based on the CodeIgniter framework, we
>>>>>>>> have about 45 RESTful interfaces built on this framework so I just put
>>>>>>>> together another one for Kafka.
>>>>>>>>
>>>>>>>> Here are my notes, these were pre-dev so may be a little different than
>>>>>>>> what we ended up with.
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Restful+API+Proposal
>>>>>>>>
>>>>>>>> I will read yours later this afternoon, we should work together.
>>>>>>>>
>>>>>>>> -Jonathan
>>>>>>>>
>>>>>>>> On Fri, Aug 3, 2012 at 7:41 AM, David Arthur <mum...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I'd like to tackle this project (assuming it hasn't been started yet).
>>>>>>>>>
>>>>>>>>> I wrote up some initial thoughts here: https://gist.github.com/3248179
>>>>>>>>>
>>>>>>>>> TLDR; use Range header for specifying offsets, simple URIs like
>>>>>>>>> /kafka/topics/[topic]/[partition], use for a simple transport of bytes
>>>>>>>>> and/or represent the messages as some media type (text, json, xml)
>>>>>>>>>
>>>>>>>>> Feedback is most welcome (in the Gist or in this thread).
>>>>>>>>>
>>>>>>>>> Cheers!
>>>>>>>>>
>>>>>>>>> -David