Wang,

That did it. Thanks a lot.


- Shekar

On Thu, May 14, 2015 at 10:38 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hi Shekar,
>
> It seems the incoming / outgoing topics are not the root of the problem
> here; the checkpoint topic "__samza_checkpoint_ver_1_for_Argos" is. From
> the error logs, this topic has only one replica, 1018019532, which was
> down and hence not available.
>
> Guozhang
>
> On Thu, May 14, 2015 at 5:16 AM, Shekar Tippur <ctip...@gmail.com> wrote:
>
> > Here is what I see on Kafka log:
> >
> > [2015-05-14 04:11:27,752] ERROR Closing socket for /10.180.195.32 because of error (kafka.network.Processor)
> >
> > java.io.IOException: Connection reset by peer
> >
> >         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> >
> >         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> >
> >         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> >
> >         at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> >
> >         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> >
> >         at kafka.utils.Utils$.read(Utils.scala:375)
> >
> >         at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
> >
> >         at kafka.network.Processor.read(SocketServer.scala:347)
> >
> >         at kafka.network.Processor.run(SocketServer.scala:245)
> >
> >         at java.lang.Thread.run(Thread.java:745)
> >
> > [2015-05-14 04:11:27,753] INFO Closing socket connection to /10.180.195.32. (kafka.network.Processor)
> >
> > [2015-05-14 04:16:06,537] INFO Closing socket connection to /10.180.195.32. (kafka.network.Processor)
> >
> > [2015-05-14 04:16:06,604] INFO Closing socket connection to /10.180.195.32. (kafka.network.Processor)
> >
> > [2015-05-14 04:16:32,370] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > [2015-05-14 04:16:32,452] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > [2015-05-14 04:16:32,810] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > [2015-05-14 04:16:32,931] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > [2015-05-14 04:36:40,586] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > [2015-05-14 04:39:49,016] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > [2015-05-14 04:43:38,166] INFO Closing socket connection to /10.180.195.32. (kafka.network.Processor)
> >
> > [2015-05-14 04:43:38,392] INFO [ReplicaFetcherManager on broker 1018019533] Removed fetcher for partitions [argos-parser,0],[argos-raw,0] (kafka.server.ReplicaFetcherManager)
> >
> > [2015-05-14 04:43:40,746] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > [2015-05-14 04:43:40,855] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > [2015-05-14 04:43:40,957] INFO Closing socket connection to /10.180.195.33. (kafka.network.Processor)
> >
> > On Thu, May 14, 2015 at 4:55 AM, Shekar Tippur <ctip...@gmail.com> wrote:
> >
> > > Here is the complete log:
> > >
> > > http://pastebin.com/nX7twETm
> > >
> > > Interesting, I see a leader-not-available exception instead of the
> > > earlier one.
> > >
> > >
> > > ./container_1431601903660_0001_01_000002/samza-container-0.log:2015-05-14 04:53:41 BrokerPartitionInfo [WARN] Error while fetching metadata partition 0 leader: none replicas: 1018019532 (sprdargas402.corp.intuit.net:6667) isr: isUnderReplicated: true for topic partition [__samza_checkpoint_ver_1_for_Argos_1,0]: [class kafka.common.LeaderNotAvailableException]
> > >
> > > - Shekar
> > >
> > > On Wed, May 13, 2015 at 7:52 PM, Naveen S <navg...@gmail.com> wrote:
> > >
> > >> Hey Shekar,
> > >> Can you paste the entire stacktrace/log? Were there any other errors?
> > >> On Wed, May 13, 2015 at 6:04 PM Shekar Tippur <ctip...@gmail.com> wrote:
> > >>
> > >> > Hello,
> > >> >
> > >> > I seem to have come across an issue with replication. We have 2 nodes
> > >> > where Kafka and YARN run.
> > >> >
> > >> > We have enabled replication on Kafka (replication factor = 2). To test
> > >> > redundancy, we shut down the broker01 server.
> > >> > In the YARN application logs, we see the exception
> > >> > kafka.common.ReplicaNotAvailableException.
> > >> >
> > >> > Incoming topic:
> > >> >
> > >> > /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --topic raw --describe
> > >> >
> > >> > Topic:raw PartitionCount:1 ReplicationFactor:2 Configs:
> > >> >
> > >> > Topic: argos-raw Partition: 0 Leader: 1018019533 Replicas: 1018019533,1018019532 Isr: 1018019533,1018019532
> > >> >
> > >> > Outgoing topic:
> > >> >
> > >> > /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --topic parser --describe
> > >> >
> > >> > Topic:parser PartitionCount:1 ReplicationFactor:2 Configs:
> > >> >
> > >> > Topic: argos-parser Partition: 0 Leader: 1018019533 Replicas: 1018019533,1018019532 Isr: 1018019533,1018019532
> > >> >
> > >> > Any idea on why this could be happening?
> > >> >
> > >> > - Shekar
> > >> >
> > >>
> > >
> > >
> >
>
>
>
> --
> -- Guozhang
>
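
For anyone hitting the same symptom: the diagnosis in this thread (a topic whose only replica lived on the broker that was shut down) can be checked from the same `kafka-topics.sh --describe` output quoted above. What follows is only a rough sketch, not something posted by anyone in the thread. The function name `single_replica_partitions` is made up for illustration, and it assumes the describe output keeps the "Topic: ... Partition: ... Replicas: ..." layout shown in the original message.

```shell
# Hypothetical helper (not part of Kafka): reads `kafka-topics.sh --describe`
# output on stdin and prints each partition that has fewer than two replicas.
single_replica_partitions() {
  awk '
    # Partition lines contain both "Partition:" and "Replicas:"; the
    # per-topic summary line ("Topic:raw PartitionCount:1 ...") has neither.
    /Partition:/ && /Replicas:/ {
      topic = ""; part = ""; replicas = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
      }
      # Count the comma-separated broker ids in the replica list.
      n = split(replicas, ids, ",")
      if (n < 2) print topic, part, "replicas=" replicas
    }
  '
}
```

Assuming the same install path and ZooKeeper address as in the original message, it could be run against every topic, internal ones such as the Samza checkpoint topic included:

    /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --describe | single_replica_partitions

Any partition it prints would become unavailable if the one broker in its replica list goes down, which matches the "leader: none" warning in the container log.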
