Thanks! Using the trunk:

svn co http://svn.apache.org/repos/asf/incubator/kafka/trunk kafka

you don't have this problem.
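In case it helps anyone else, this is roughly the sequence I used (build steps as described in the trunk README at the time; adjust if the layout has changed since):

svn co http://svn.apache.org/repos/asf/incubator/kafka/trunk kafka
cd kafka
./sbt update
./sbt package
cd contrib/hadoop-consumer
# same commands as before, now against the trunk's contrib consumer
./run-class.sh kafka.etl.impl.DataGenerator test/test.properties
./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties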
2011/11/7 Felix GV <fe...@mate1inc.com>

> I think I've had the same bug. It's a known issue that is fixed in the
> trunk.
>
> You should check out Kafka from the (Apache) trunk and use the hadoop
> consumer provided there in the contrib directory. If I'm not mistaken,
> that version is more up to date than the one you mentioned on github...
>
> --
> Felix
>
> On Monday, November 7, 2011, Raimon Bosch <raimon.bo...@gmail.com> wrote:
> > Problem solved! It was a configuration issue.
> >
> > Trying with:
> > event.count=1000
> > kafka.request.limit=1000
> >
> > the mapper stopped and generated a file with 1000 events. But if we use
> > kafka.request.limit=-1, it sends the same events over and over again;
> > that's why my hadoop-consumer couldn't stop.
> >
> > 2011/11/7 Raimon Bosch <raimon.bo...@gmail.com>
> >
> >> Hi,
> >>
> >> I have just compiled Kafka from https://github.com/kafka-dev/kafka and
> >> executed the DataGenerator:
> >>
> >> ./run-class.sh kafka.etl.impl.DataGenerator test/test.properties
> >>
> >> After that I executed the hadoop consumer:
> >>
> >> ./run-class.sh kafka.etl.impl.SimpleKafkaETLJob test/test.properties
> >>
> >> The hadoop-consumer is generating a file in the specified output
> >> directory, but it never finishes, even if I generate only 1 event
> >> in test/test.properties. The file keeps growing and growing; my guess
> >> is that maybe it is always reading from offset 0?
> >>
> >> This is my test.properties:
> >>
> >> # name of test topic
> >> kafka.etl.topic=SimpleTestEvent5
> >>
> >> # hdfs location of jars
> >> hdfs.default.classpath.dir=/tmp/kafka/lib
> >>
> >> # number of test events to be generated
> >> event.count=1
> >>
> >> # hadoop id and group
> >> hadoop.job.ugi=kafka,hadoop
> >>
> >> # kafka server uri
> >> kafka.server.uri=tcp://localhost:9092
> >>
> >> # hdfs location of input directory
> >> input=/tmp/kafka/data
> >>
> >> # hdfs location of output directory
> >> output=/tmp/kafka/output
> >>
> >> # limit the number of events to be fetched;
> >> # value -1 means no limitation
> >> kafka.request.limit=-1
> >>
> >> # kafka parameters
> >> client.buffer.size=1048576
> >> client.so.timeout=60000
> >>
> >> Any ideas where the problem could be?
> >>
>
> --
> Felix
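P.S. For anyone else hitting the replay behaviour quoted above (kafka.request.limit=-1 sending the same events over and over), that is the symptom of a fetch loop that never advances its offset. A minimal sketch of what a correct loop has to do, assuming the 0.7-era SimpleConsumer Java API (topic name, host, port and buffer/timeout values taken from the test.properties above; this is an illustration, not the actual contrib code):

import kafka.api.FetchRequest;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.javaapi.message.ByteBufferMessageSet;
import kafka.message.MessageAndOffset;

public class OffsetLoopSketch {
    public static void main(String[] args) {
        // timeout and buffer size mirror client.so.timeout
        // and client.buffer.size from test.properties
        SimpleConsumer consumer =
            new SimpleConsumer("localhost", 9092, 60000, 1048576);
        long offset = 0L; // a first run starts from the beginning
        while (true) {
            ByteBufferMessageSet messages = consumer.fetch(
                new FetchRequest("SimpleTestEvent5", 0, offset, 1048576));
            for (MessageAndOffset mo : messages) {
                // ... write mo.message() to the output ...
                offset = mo.offset(); // in 0.7, offset() is the next fetch offset
            }
            // Without the offset update above, every iteration fetches
            // from offset 0 again: the same events over and over, and
            // the output file grows forever.
        }
    }
}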