Hi, Joe,

Looking forward to your contribution. We plan to start working on
replication in a couple of weeks. I will try to update the jira with some
more detailed info.
Thanks,

Jun

On Fri, Nov 4, 2011 at 11:53 AM, Joe Stein <crypt...@gmail.com> wrote:

> Thanks for the quick responses.
>
> Jun: KAFKA-50 looks pretty interesting. I am going to go through it in
> more detail again tonight. This feature not being in yet is not going to
> block me from getting going, but I think it will for widespread use. Is
> there any opportunity to help contribute to this? Maybe starting with
> something else (smaller) would be better to get my feet wet? I am going
> to start to cozy up to the code this weekend... right now "Test Starting:
> testProduceAndMultiFetch(kafka.javaapi.integration.PrimitiveApiTest)"
> keeps hanging, but maybe it is a resource issue on my machine (I will try
> it on another machine and, if it is still an issue, will send it to the
> dev list).
>
> Neha: I think, as you said, "just copy over the topic directories and
> start the Kafka cluster" will be a sufficient approach for now if/when a
> server dies, since it sounds like another broker (Y) would get the
> requests when broker (X) dies (this is an assumption and easy enough for
> me to test). For the "at least once" delivery guarantee, I am assuming
> you are asking what we would do if we get an event that we may have
> already received because of a failure? We actually deal with this type of
> thing a lot already (mostly from our mobile devices). It depends on the
> data that is flowing (in this paradigm I guess it would be the topic),
> and if N errors occur within the last T seconds, each processor has logic
> for what to do (for some, even a System.exit())... with Kafka I would
> hope we would know when this happens, meaning we know an error has
> occurred and we might get this one row again (in which case we could keep
> it somewhere resident to validate against for a period of Y).
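>
> Something like this rough sketch is the kind of thing I have in mind --
> purely hypothetical Java on my side, not anything in Kafka itself, with
> made-up names and the window length Y left as a knob:
>
> import java.util.Iterator;
> import java.util.LinkedHashMap;
> import java.util.Map;
>
> // Hypothetical sketch, not Kafka API: remember message ids for a window
> // of Y milliseconds and flag anything we have already seen.
> public class DedupWindow {
>     private final long windowMs;
>     private final Map<String, Long> seen = new LinkedHashMap<String, Long>();
>
>     public DedupWindow(long windowMs) { this.windowMs = windowMs; }
>
>     // Returns true the first time an id shows up inside the window.
>     public synchronized boolean firstTime(String messageId) {
>         long now = System.currentTimeMillis();
>         // Entries are kept in insertion order (and only ever inserted,
>         // never updated), so expired ones sit at the front.
>         Iterator<Map.Entry<String, Long>> it = seen.entrySet().iterator();
>         while (it.hasNext() && now - it.next().getValue() > windowMs) {
>             it.remove();
>         }
>         if (seen.containsKey(messageId)) return false;
>         seen.put(messageId, now);
>         return true;
>     }
> }
>
> (Keyed on whatever uniquely identifies the row, with the window tuned
> per topic.)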
>
> /*
> Joe Stein
> http://www.linkedin.com/in/charmalloc
> Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> */
>
> On Fri, Nov 4, 2011 at 12:41 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
> > > i mean, the log segments
> > > will always be in a consistent state on disk, right?
> >
> > Yes. You can just copy over the topic directories and start the Kafka
> > cluster.
> >
> > Thanks,
> > Neha
> >
> > On Fri, Nov 4, 2011 at 9:35 AM, Tim Lossen <t...@lossen.de> wrote:
> > > ok, but apart from the possibility of data loss, rsync
> > > in principle should work fine? i mean, the log segments
> > > will always be in a consistent state on disk, right?
> > >
> > > On 2011-11-04, at 17:18 , Neha Narkhede wrote:
> > >
> > >>> for redundancy, we were planning to simply rsync the kafka
> > >>> message logs to a second machine periodically. or do you
> > >>> see any obvious problems with that?
> > >>
> > >> The rsync approach has several problems and could be a lossy
> > >> solution. We moved away from it when we replaced a legacy system
> > >> with Kafka. We recommend you set up your redundant cluster using
> > >> the mirroring approach, which is much more reliable and real-time
> > >> than rsync.
> > >>
> > >> I think 0.6 has a stripped-down version of mirroring, where you
> > >> cannot control the mirroring for specific topics.
> > >>
> > >> Thanks,
> > >> Neha
> > >>
> > >> On Fri, Nov 4, 2011 at 9:12 AM, Tim Lossen <t...@lossen.de> wrote:
> > >>>
> > >>> interesting. is this already available in 0.6?
> > >>>
> > >>> for redundancy, we were planning to simply rsync the kafka
> > >>> message logs to a second machine periodically. or do you
> > >>> see any obvious problems with that?
> > >>>
> > >>> cheers
> > >>> tim
> > >>>
> > >>> On 2011-11-04, at 17:07 , Neha Narkhede wrote:
> > >>>> We have a mirroring feature where you can set up 2 clusters, one
> > >>>> to be a mirror of the other. At LinkedIn, we have a production
> > >>>> Kafka cluster and an analytics Kafka cluster that mirrors the
> > >>>> production one in real time. We still haven't updated the
> > >>>> documentation to describe this in detail.
> > >>>
> > >>> --
> > >>> http://tim.lossen.de
> > >
> > > --
> > > http://tim.lossen.de
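
PS: for anyone following along, the mirroring Neha mentioned boils down,
conceptually, to a consumer attached to the source cluster feeding a
producer attached to the mirror cluster. A very rough sketch of that shape
(NOT the actual embedded mirroring code or the real client API; both
interfaces below are made up):

// Conceptual sketch only: SourceConsumer and TargetProducer are
// hypothetical stand-ins, not the real Kafka client API.
public class MirrorLoop {
    // blocking fetch of the next message from the source cluster (hypothetical)
    interface SourceConsumer { byte[] next() throws InterruptedException; }
    // publish a message to the mirror cluster (hypothetical)
    interface TargetProducer { void send(String topic, byte[] payload); }

    public static void run(String topic, SourceConsumer source, TargetProducer target)
            throws InterruptedException {
        while (true) {
            byte[] message = source.next(); // consume from the production cluster
            target.send(topic, message);    // re-publish to the mirror in real time
        }
    }
}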