Hdfs FsShell getmerge

2015-07-24 Thread Jan Filipiak
Hello hadoop users, I have an idea for a small feature for the getmerge tool. I recently needed to use the newline option -nl because the files I needed to merge simply didn't have one. I was merging all the files from one directory and unfortunately this directory also included
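
A minimal illustration of the invocation being discussed (paths are made up; the -nl flag is the newline option mentioned above):

```shell
# Merge every file under an HDFS directory into one local file.
# -nl appends a newline after each source file, so records from
# adjacent files don't run together when the files lack a trailing newline.
hdfs dfs -getmerge -nl /data/input /tmp/merged.txt
```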

Re: Hdfs FsShell getmerge

2015-07-24 Thread Jan Filipiak
Sorry, wrong mailing list. On 24.07.2015 16:44, Jan Filipiak wrote: Hello hadoop users, I have an idea for a small feature for the getmerge tool. I recently needed to use the newline option -nl because the files I needed to merge simply didn't have one. I was merging all the files

Re: producer test on multiple nodes

2015-07-24 Thread Gwen Shapira
Does the topic speedx1 exist? On Fri, Jul 24, 2015 at 7:09 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi, I am trying to run 20 performance tests on 10 nodes using pbsdsh. The messages are sent to a 6-broker cluster. It seems to work for a while. When I delete the test queue and rerun the

Re: New consumer - partitions auto assigned only on poll

2015-07-24 Thread Stevo Slavić
Hello Jason, Thanks for feedback. I've created ticket for this https://issues.apache.org/jira/browse/KAFKA-2359 Kind regards, Stevo Slavic. On Wed, Jul 22, 2015 at 6:18 PM, Jason Gustafson ja...@confluent.io wrote: Hey Stevo, That's a good point. I think the javadoc is pretty clear that

Re: producer test on multiple nodes

2015-07-24 Thread Yuheng Du
I deleted the queue and recreated it before I ran the test. Things are working after restarting the broker cluster, thanks! On Fri, Jul 24, 2015 at 12:06 PM, Gwen Shapira gshap...@cloudera.com wrote: Does the topic speedx1 exist? On Fri, Jul 24, 2015 at 7:09 AM, Yuheng Du yuheng.du.h...@gmail.com

New consumer - the offset one gets in poll is not the offset one is supposed to commit

2015-07-24 Thread Stevo Slavić
Hello Apache Kafka community, Say there is only one topic with a single partition and a single message on it. Calling poll with the new consumer will return a ConsumerRecord for that message, and it will have an offset of 0. After processing the message, the current KafkaConsumer implementation expects
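
The convention at issue: the committed offset is the position of the *next* record to consume, so after processing the record at offset 0 one commits 1, not 0. A minimal sketch against the new KafkaConsumer API as it shipped in 0.9 (bootstrap servers, group id, and topic name are placeholders):

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CommitNextOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "example-group");           // placeholder
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                // ... process the record ...
                // The committed offset is the offset of the NEXT record to
                // read, so commit record.offset() + 1, not record.offset().
                consumer.commitSync(Collections.singletonMap(
                    new TopicPartition(record.topic(), record.partition()),
                    new OffsetAndMetadata(record.offset() + 1)));
            }
        }
    }
}
```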

Log Deletion Behavior

2015-07-24 Thread JIEFU GONG
Hi all, I have a few broad questions on how log deletion works, specifically in conjunction with the log.retention.time setting. Say I published some messages to some topics when the configuration was originally set to something like log.retention.hours=168 (default). If I publish these messages

producer test on multiple nodes

2015-07-24 Thread Yuheng Du
Hi, I am trying to run 20 performance tests on 10 nodes using pbsdsh. The messages are sent to a 6-broker cluster. It seems to work for a while. When I delete the test queue and rerun the test, the broker does not seem to process incoming messages: [yuhengd@node1739 kafka_2.10-0.8.2.1]$
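
When a test topic is deleted and recreated, it's worth confirming the topic actually exists before rerunning the producers. A sketch with the stock 0.8.x tooling (ZooKeeper address, topic name, and partition counts are placeholders; deletion only takes effect if delete.topic.enable=true on the brokers):

```shell
# Check whether the topic exists and how its partitions are assigned.
bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic speedx1

# Delete and recreate it (requires delete.topic.enable=true).
bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic speedx1
bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic speedx1 \
    --partitions 6 --replication-factor 1
```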

deleting data automatically

2015-07-24 Thread Yuheng Du
Hi, I am testing the kafka producer performance. So I created a queue and write a large amount of data to that queue. Is there a way to delete the data automatically after some time? Say, whenever the data size reaches 50GB or the retention time exceeds 10 seconds, it will be deleted so my disk

Changing error codes from the simple consumer?

2015-07-24 Thread Coolbeth, Matthew
I am having a strange interaction with the simple consumer API in Kafka 0.8.2. Running the following code: public FetchedMessageList send(String topic, int partition, long offset, int fetchSize) throws KafkaException { try { FetchedMessageList results = new
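
The quoted code is cut off by the archive, but the usual shape of an 0.8.2 SimpleConsumer fetch that surfaces broker-side error codes looks roughly like this (host, port, topic, partition, and offset are placeholders, not the poster's actual code):

```java
import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class FetchWithErrorCheck {
    public static void main(String[] args) {
        // soTimeout 100000 ms, 64KB buffer; all values illustrative.
        SimpleConsumer consumer =
            new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "example-client");
        FetchRequest request = new FetchRequestBuilder()
            .clientId("example-client")
            .addFetch("my-topic", 0, 0L, 100000)
            .build();
        FetchResponse response = consumer.fetch(request);
        if (response.hasError()) {
            // Check the per-partition error code instead of swallowing it;
            // e.g. 1 = OffsetOutOfRange, 3 = UnknownTopicOrPartition.
            short code = response.errorCode("my-topic", 0);
            System.err.println("fetch failed with error code " + code);
        }
        consumer.close();
    }
}
```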

Re: Log Deletion Behavior

2015-07-24 Thread Mayuresh Gharat
No. This should not happen. At LinkedIn we just use the log retention hours. Try using that. Change it and bounce the broker. It should work. Also, looking back at the configs, I am not sure why we have 3 different configs for the same property: log.retention.ms log.retention.minutes

error while high level consumer

2015-07-24 Thread Kris K
Hi, I started seeing these errors in the logs continuously when I try to bring the High Level Consumer up. Please help. ZookeeperConsumerConnector [INFO] [XXX], waiting for the partition ownership to be deleted: 1 ZookeeperConsumerConnector [INFO] [XXX], end rebalancing consumer

Re: Log Deletion Behavior

2015-07-24 Thread Mayuresh Gharat
To add on, the main thing here is you should be using only one of these properties. Thanks, Mayuresh On Fri, Jul 24, 2015 at 6:47 PM, Mayuresh Gharat gharatmayures...@gmail.com wrote: Yes. It should. Do not set other retention settings. Just use the hours settings. Let me know about this

Re: Log Deletion Behavior

2015-07-24 Thread Grant Henke
Also this stackoverflow answer may help: http://stackoverflow.com/a/29672325 On Fri, Jul 24, 2015 at 9:36 PM, Grant Henke ghe...@cloudera.com wrote: I would actually suggest only using the ms versions of the retention config. Be sure to check/set all the configs below and look at the

Re: deleting data automatically

2015-07-24 Thread gharatmayuresh15
You can configure that in the configs by setting log retention: http://kafka.apache.org/07/configuration.html Thanks, Mayuresh Sent from my iPhone On Jul 24, 2015, at 12:49 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi, I am testing the kafka producer performance. So I created a

Any tool to easily fetch a single message or a few messages starting from a given offset then exit after fetching said count of messages?

2015-07-24 Thread David Luu
Hi, I notice the kafka-console-consumer.sh script has an option to fetch a max # of messages, which can be 1 or more, and then exit. Which is nice. But as a high level consumer, it's missing an option to fetch from a given offset other than the earliest/latest offsets. Is there any off-the-shelf tool (CLI,
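
One stock tool worth trying: kafka-simple-consumer-shell.sh, which ships with 0.8.x and takes an explicit starting offset and message cap (broker list, topic, partition, and offset below are placeholders):

```shell
# Fetch a few messages starting at a specific offset, then exit.
bin/kafka-simple-consumer-shell.sh \
    --broker-list localhost:9092 \
    --topic my-topic \
    --partition 0 \
    --offset 12345 \
    --max-messages 5
```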

Re: Log Deletion Behavior

2015-07-24 Thread JIEFU GONG
Okay, I will look into only using the hours setting, but I think that means the minimum time a log can be stored is 1 hour, right? I think the last time I tried, Kafka failed to parse decimals. On Fri, Jul 24, 2015 at 6:47 PM, Mayuresh Gharat gharatmayures...@gmail.com wrote: Yes. It should. Do

Re: deleting data automatically

2015-07-24 Thread Ewen Cheslack-Postava
You'll want to set the log retention policy via log.retention.{ms,minutes,hours} or log.retention.bytes. If you want really aggressive collection (e.g., on the order of seconds, as you specified), you might also need to adjust log.segment.bytes/log.roll.{ms,hours} and
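
An illustrative broker config along those lines (values are examples only; second-level retention like this is unusual and forces very frequent segment rolls, since only closed segments are eligible for deletion):

```properties
# Delete data once it is ~10s old or the partition exceeds ~50GB.
log.retention.ms=10000
log.retention.bytes=53687091200
# Roll segments quickly so old data becomes deletable soon after it is written.
log.segment.bytes=104857600
log.roll.ms=10000
# How often the broker checks for segments eligible for deletion.
log.retention.check.interval.ms=5000
```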

Re: Log Deletion Behavior

2015-07-24 Thread JIEFU GONG
Mayuresh, thanks for your comment. I won't be able to change these settings until next Monday, but just to confirm: you are saying that if I restart the brokers, my logs should delete themselves with respect to the newest settings, correct? On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat

Re: Log Deletion Behavior

2015-07-24 Thread Mayuresh Gharat
Yes. It should. Do not set other retention settings. Just use the hours settings. Let me know about this :) Thanks, Mayuresh On Fri, Jul 24, 2015 at 6:43 PM, JIEFU GONG jg...@berkeley.edu wrote: Mayuresh, thanks for your comment. I won't be able to change these settings until next Monday,

Re: Log Deletion Behavior

2015-07-24 Thread Grant Henke
I would actually suggest only using the ms versions of the retention config. Be sure to check/set all the configs below and look at the documented explanations here http://kafka.apache.org/documentation.html#brokerconfigs. I am guessing the default log.retention.check.interval.ms of 5 minutes is
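
The precedence being alluded to: when several retention properties are set, the finest-grained one wins (ms beats minutes beats hours), so setting only the ms variants avoids surprises, and it also sidesteps the 1-hour granularity concern raised earlier in the thread. For example (values illustrative):

```properties
# If all three are set, log.retention.ms takes precedence.
log.retention.ms=3600000
# These would then be ignored:
# log.retention.minutes=120
# log.retention.hours=168

# Deletion is only evaluated this often (default 5 minutes).
log.retention.check.interval.ms=300000
```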