Thanks! Using the trunk:

svn co http://svn.apache.org/repos/asf/incubator/kafka/trunk kafka

you don't have this problem.
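In case it helps anyone else, this is roughly the sequence I used (build steps as described in the trunk README at the time; adjust if the layout has changed since):

svn co http://svn.apache.org/repos/asf/incubator/kafka/trunk kafka
cd kafka
./sbt update
./sbt package
cd contrib/hadoop-consumer
# same commands as before, now against the trunk's contrib consumer
./run-class.sh kafka.etl.impl.DataGenerator test/test.properties
./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties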
2011/11/7 Felix GV <fe...@mate1inc.com>

> I think I've had the same bug. It's a known issue that is fixed in the
> trunk.
>
> You should check out Kafka from the (Apache) trunk and use the hadoop
> consumer provided there in the contrib directory. If I'm not mistaken,
> that version is more up to date than the one you mentioned on github...
>
> --
> Felix
>
> On Monday, November 7, 2011, Raimon Bosch <raimon.bo...@gmail.com> wrote:
> > Problem solved! It was a configuration issue.
> >
> > Trying with:
> > event.count=1000
> > kafka.request.limit=1000
> >
> > the mapper stopped and generated a file with 1000 events. But if we use
> > kafka.request.limit=-1, it sends the same events over and over again;
> > that's why my hadoop-consumer couldn't stop.
> >
> > 2011/11/7 Raimon Bosch <raimon.bo...@gmail.com>
> >
> >> Hi,
> >>
> >> I have just compiled Kafka from https://github.com/kafka-dev/kafka and
> >> executed the DataGenerator:
> >>
> >> ./run-class.sh kafka.etl.impl.DataGenerator test/test.properties
> >>
> >> After that I executed the hadoop consumer:
> >>
> >> ./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties
> >>
> >> The hadoop-consumer is generating a file in the specified output
> >> directory, but it never finishes, even if I generate only 1 event
> >> in test/test.properties. The file keeps growing and growing; my guess
> >> is that maybe it is always reading from offset 0?
> >>
> >> This is my test.properties:
> >>
> >> # name of test topic
> >> kafka.etl.topic=SimpleTestEvent5
> >>
> >> # hdfs location of jars
> >> hdfs.default.classpath.dir=/tmp/kafka/lib
> >>
> >> # number of test events to be generated
> >> event.count=1
> >>
> >> # hadoop id and group
> >> hadoop.job.ugi=kafka,hadoop
> >>
> >> # kafka server uri
> >> kafka.server.uri=tcp://localhost:9092
> >>
> >> # hdfs location of input directory
> >> input=/tmp/kafka/data
> >>
> >> # hdfs location of output directory
> >> output=/tmp/kafka/output
> >>
> >> # limit the number of events to be fetched;
> >> # value -1 means no limitation
> >> kafka.request.limit=-1
> >>
> >> # kafka parameters
> >> client.buffer.size=1048576
> >> client.so.timeout=60000
> >>
> >> Any ideas where the problem could be?
> >>
>
> --
> Felix
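P.S. For anyone else hitting the replay behaviour quoted above (kafka.request.limit=-1 sending the same events over and over), that is the symptom of a fetch loop that never advances its offset. A minimal sketch of what a correct loop has to do, assuming the 0.7-era SimpleConsumer Java API (topic name, host, port and buffer/timeout values taken from the test.properties above; this is an illustration, not the actual contrib code):

import kafka.api.FetchRequest;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.javaapi.message.ByteBufferMessageSet;
import kafka.message.MessageAndOffset;

public class OffsetLoopSketch {
    public static void main(String[] args) {
        // timeout and buffer size mirror client.so.timeout
        // and client.buffer.size from test.properties
        SimpleConsumer consumer =
            new SimpleConsumer("localhost", 9092, 60000, 1048576);
        long offset = 0L; // a first run starts from the beginning
        while (true) {
            ByteBufferMessageSet messages = consumer.fetch(
                new FetchRequest("SimpleTestEvent5", 0, offset, 1048576));
            for (MessageAndOffset mo : messages) {
                // ... write mo.message() to the output ...
                offset = mo.offset(); // in 0.7, offset() is the next fetch offset
            }
            // Without the offset update above, every iteration fetches
            // from offset 0 again: the same events over and over, and
            // the output file grows forever.
        }
    }
}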