[jira] [Created] (KAFKA-184) Log retention size and file size should be a long
Log retention size and file size should be a long
-------------------------------------------------

                 Key: KAFKA-184
                 URL: https://issues.apache.org/jira/browse/KAFKA-184
             Project: Kafka
          Issue Type: Bug
            Reporter: Joel Koshy
            Priority: Minor
             Fix For: 0.8


Realized this in a local setup: the log.retention.size config option should be a long, or we're limited to 2GB. Also, the name can be improved to log.retention.size.bytes or Mbytes as appropriate. Same comments for log.file.size. If we rename the configs, it would be better to resolve KAFKA-181 first.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
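[Editor's note: a minimal Java sketch of the 2GB ceiling described above. The "4 GB" value is illustrative, not an actual Kafka config; the point is only that byte counts above Integer.MAX_VALUE cannot be parsed or held as an int.]

```java
// Illustrates why log.retention.size needs to be a long: Integer.MAX_VALUE
// is 2147483647 (~2.1 GB), so any byte size beyond that must be a long.
public class RetentionSize {
    public static void main(String[] args) {
        String fourGb = "4294967296"; // 4 * 1024^3 bytes, an example value

        long asLong = Long.parseLong(fourGb);           // fine as a long
        System.out.println(asLong > Integer.MAX_VALUE); // true

        try {
            Integer.parseInt(fourGb);                   // int tops out at ~2.1 GB
        } catch (NumberFormatException e) {
            System.out.println("too large for an int");
        }
    }
}
```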
[jira] [Created] (KAFKA-183) Expose offset vector to the consumer
Expose offset vector to the consumer
------------------------------------

                 Key: KAFKA-183
                 URL: https://issues.apache.org/jira/browse/KAFKA-183
             Project: Kafka
          Issue Type: New Feature
            Reporter: Jay Kreps
            Assignee: Jay Kreps


We should enable consumers to save their position themselves. This would be useful for consumers that need to store consumed data, since they could then store the data and the position together. This gives a poor man's "transactionality": any data loss on the consumer also rewinds the position to the previous position, so the two are always in sync.

Two ways to do this:
1. Add an OffsetStorage interface and have the zk storage implement this. The user can override this by providing an OffsetStorage implementation of their own to change how values are stored.
2. Make commit() return the position offset vector and add a setPosition(List) method to initialize the position.

Let's figure out any potential problems with this, and work out the best approach.
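[Editor's note: a hypothetical sketch of option 1 above. Only the name OffsetStorage comes from the ticket; the method signatures and the in-memory implementation are illustrative assumptions, not actual Kafka APIs.]

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an OffsetStorage interface a ZooKeeper-backed (or user-supplied)
// implementation could satisfy. Method shapes are assumptions.
interface OffsetStorage {
    void commit(String topic, int partition, long offset);
    long fetch(String topic, int partition); // -1 if no offset stored
}

// In-memory stand-in for a custom user implementation.
class InMemoryOffsetStorage implements OffsetStorage {
    private final Map<String, Long> offsets = new HashMap<>();

    public void commit(String topic, int partition, long offset) {
        offsets.put(topic + "-" + partition, offset);
    }

    public long fetch(String topic, int partition) {
        return offsets.getOrDefault(topic + "-" + partition, -1L);
    }
}
```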
[jira] [Created] (KAFKA-182) Set a TCP connection timeout for the SimpleConsumer
Set a TCP connection timeout for the SimpleConsumer
---------------------------------------------------

                 Key: KAFKA-182
                 URL: https://issues.apache.org/jira/browse/KAFKA-182
             Project: Kafka
          Issue Type: Bug
            Reporter: Jay Kreps


Currently we use SocketChannel.open, which I *think* can block for a long time. We should make this configurable, and we may have to create the socket in a different way to enable this.
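[Editor's note: one possible shape of the "different way to create the socket" mentioned above, as a hedged sketch: open an unconnected SocketChannel, then connect through the underlying java.net.Socket, which accepts a timeout in milliseconds. The class and method names are illustrative, not Kafka code.]

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;

public class TimedConnect {
    // Connect with an explicit timeout instead of the potentially
    // long-blocking SocketChannel.open(SocketAddress).
    public static SocketChannel connect(String host, int port, int timeoutMs)
            throws IOException {
        SocketChannel channel = SocketChannel.open(); // opened, not yet connected
        try {
            // The Socket view of the channel takes a connect timeout.
            channel.socket().connect(new InetSocketAddress(host, port), timeoutMs);
            return channel;
        } catch (IOException e) {
            channel.close(); // don't leak the channel on failure
            throw e;
        }
    }
}
```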
[jira] [Commented] (KAFKA-181) Log errors for unrecognized config options
    [ https://issues.apache.org/jira/browse/KAFKA-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141517#comment-13141517 ]

Jay Kreps commented on KAFKA-181:
---------------------------------

Yes, please yes. I recommend we create a Config object that wraps java.util.Properties. It should include all the random Utils helpers we have for parsing ints and such. Whenever get() is called for a property string, we should record that property in a set. We can add a method that intersects the requested properties with the provided properties to get the unused properties. This config can be used in KafkaConfig and other configs.

As a side note, there are many places where we need to let the user provide plugins that implement an interface. Examples are the EventHandler and Serializer interfaces in the producer, and you could imagine us making other things, such as offset storage, pluggable. One requirement to make this work is that it needs to be possible for the user to set properties for their plugin. For example, to create an AvroSerializer you need to be able to pass in a schema.registry.url parameter, which needs to get passed through unmolested to the AvroSerializerImpl. To enable this, config objects like KafkaConfig that parse out their options should retain the original Config instance. The general contract for plugins should be that they must provide a constructor that takes a Config, so that these configs can be passed through.

> Log errors for unrecognized config options
> ------------------------------------------
>
>                 Key: KAFKA-181
>                 URL: https://issues.apache.org/jira/browse/KAFKA-181
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>            Reporter: Joel Koshy
>             Fix For: 0.8
>
>
> Currently, unrecognized config options are silently ignored. Notably, if a
> config has a typo or if a deprecated config is used, then there is no warning
> issued and defaults are assumed. One can argue that the broker or a consumer
> or a producer with an unrecognized config option should not even be allowed
> to start up, especially if defaults are silently assumed, but it would be
> good to at least log an error.
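[Editor's note: a minimal Java sketch of the tracking Config wrapper the comment above proposes: record every key that a getter is asked for, then intersect against the supplied properties to find unused ones. The class and method names are illustrative, not actual Kafka classes.]

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

class TrackingConfig {
    private final Properties props;
    private final Set<String> requested = new HashSet<>();

    TrackingConfig(Properties props) {
        this.props = props;
    }

    // Every getter records the key it was asked for.
    String getString(String key, String dflt) {
        requested.add(key);
        return props.getProperty(key, dflt);
    }

    int getInt(String key, int dflt) {
        requested.add(key);
        String v = props.getProperty(key);
        return v == null ? dflt : Integer.parseInt(v);
    }

    // Properties the user supplied that nothing ever asked for --
    // likely typos or deprecated options worth logging.
    Set<String> unusedProperties() {
        Set<String> unused = new HashSet<>(props.stringPropertyNames());
        unused.removeAll(requested);
        return unused;
    }
}
```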
[jira] [Created] (KAFKA-181) Log errors for unrecognized config options
Log errors for unrecognized config options
------------------------------------------

                 Key: KAFKA-181
                 URL: https://issues.apache.org/jira/browse/KAFKA-181
             Project: Kafka
          Issue Type: Improvement
          Components: core
            Reporter: Joel Koshy
             Fix For: 0.8


Currently, unrecognized config options are silently ignored. Notably, if a config has a typo or if a deprecated config is used, then there is no warning issued and defaults are assumed. One can argue that the broker or a consumer or a producer with an unrecognized config option should not even be allowed to start up, especially if defaults are silently assumed, but it would be good to at least log an error.
Re: KAFKA-50 replication support and the Disruptor
There are several wait strategies. You will want to use a spin lock in production environments, where you should have enough CPU cores anyway. Remember, the 'real' work runs in another always-running thread that also uses a spin lock to wait for more work. In a dev environment, or on hosts that need to do lots of other stuff, you definitely need another wait strategy.

Erik.

Op 31-10-11 21:38, Chris Burroughs wrote:
> On 10/31/2011 04:23 AM, Erik van Oosten wrote:
>> That is not the point (mostly). While you're waiting for a lock, you
>> can't issue another IO request. Avoiding locking is worthwhile even if
>> CPU is the bottleneck. The advantage is that you'll get lower latency
>> and, also important, less jitter.
>
> \begin{Tangent}
> Doesn't the Disruptor use a spin lock though? I would expect that to not
> play nice if sharing a core with CPU-bound threads doing 'real' work.

--
Erik van Oosten
http://www.day-to-day-stuff.blogspot.com/
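[Editor's note: a tiny sketch of the trade-off discussed in this thread: a busy-spin wait burns a core polling for work, while a yielding wait gives the core back at the cost of latency and jitter. This is not Disruptor code; the names are illustrative.]

```java
import java.util.concurrent.atomic.AtomicLong;

class WaitStrategies {
    // Spin until the published sequence reaches the one we want.
    // Lowest latency, but occupies a core the whole time.
    static long busySpinWait(AtomicLong published, long wanted) {
        long seq;
        while ((seq = published.get()) < wanted) {
            Thread.onSpinWait(); // CPU hint; the thread keeps running
        }
        return seq;
    }

    // Yield between checks: friendlier on shared hosts, but each
    // reschedule adds latency and jitter.
    static long yieldingWait(AtomicLong published, long wanted) {
        long seq;
        while ((seq = published.get()) < wanted) {
            Thread.yield();
        }
        return seq;
    }
}
```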
[jira] [Commented] (KAFKA-171) Kafka producer should do a single write to send message sets
    [ https://issues.apache.org/jira/browse/KAFKA-171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141374#comment-13141374 ]

Neha Narkhede commented on KAFKA-171:
-------------------------------------

You can check it into trunk. 0.7 is going off its own branch.

> Kafka producer should do a single write to send message sets
> ------------------------------------------------------------
>
>                 Key: KAFKA-171
>                 URL: https://issues.apache.org/jira/browse/KAFKA-171
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.7, 0.8
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>             Fix For: 0.8
>
>         Attachments: KAFKA-171-draft.patch, KAFKA-171-v2.patch, KAFKA-171.patch
>
>
> From the email thread:
> http://mail-archives.apache.org/mod_mbox/incubator-kafka-dev/201110.mbox/%3ccafbh0q1pyuj32thbayq29e6j4wt_mrg5suusfdegwj6rmex...@mail.gmail.com%3e
> > Before sending an actual message, the kafka producer sends a (control)
> > message of 4 bytes to the server. The producer always does this before
> > sending a message to the server.
> I think this is because in BoundedByteBufferSend.scala we do essentially
>   channel.write(sizeBuffer)
>   channel.write(dataBuffer)
> The correct solution is to use vectored I/O and instead do
>   channel.write(Array(sizeBuffer, dataBuffer))
[jira] [Commented] (KAFKA-171) Kafka producer should do a single write to send message sets
    [ https://issues.apache.org/jira/browse/KAFKA-171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141295#comment-13141295 ]

Jay Kreps commented on KAFKA-171:
---------------------------------

Cool, will clean up the imports before checking in. I am going to hold off on this until after 0.7 goes out.
[jira] [Commented] (KAFKA-171) Kafka producer should do a single write to send message sets
    [ https://issues.apache.org/jira/browse/KAFKA-171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141272#comment-13141272 ]

Jun Rao commented on KAFKA-171:
-------------------------------

MessageSet has a couple of unused imports. Other than that, the patch looks good.
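[Editor's note: the fix quoted in this thread uses vectored (gathering) I/O. A hedged Java sketch of the same idea: write the 4-byte size header and the payload in one call to GatheringByteChannel.write(ByteBuffer[]), which can push both buffers in a single syscall, instead of two separate write() calls. The class and method names here are illustrative, not the actual BoundedByteBufferSend code.]

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;

class BoundedSend {
    // Writes [4-byte length][payload] with a single gathering write per
    // iteration, looping because a write may be partial.
    static long writeSizedMessage(GatheringByteChannel channel, byte[] payload)
            throws IOException {
        ByteBuffer size = ByteBuffer.allocate(4).putInt(payload.length);
        size.flip(); // make the header readable
        ByteBuffer data = ByteBuffer.wrap(payload);
        ByteBuffer[] buffers = {size, data};

        long written = 0;
        while (size.hasRemaining() || data.hasRemaining()) {
            written += channel.write(buffers); // both buffers, one call
        }
        return written;
    }
}
```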
[jira] [Commented] (KAFKA-180) Clean up shell scripts
    [ https://issues.apache.org/jira/browse/KAFKA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141257#comment-13141257 ]

Jun Rao commented on KAFKA-180:
-------------------------------

SimpleConsumerShell is still useful for debugging purposes. I'd like to keep the code. The script can go.

> Clean up shell scripts
> ----------------------
>
>                 Key: KAFKA-180
>                 URL: https://issues.apache.org/jira/browse/KAFKA-180
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>
>
> Currently it is a bit of a mess:
> jkreps-mn:kafka-git jkreps$ ls bin
> kafka-console-consumer-log4j.properties
> kafka-console-consumer.sh
> kafka-console-producer.sh
> kafka-consumer-perf-test.sh
> kafka-consumer-shell.sh
> kafka-producer-perf-test.sh
> kafka-producer-shell.sh
> kafka-replay-log-producer.sh
> kafka-run-class.sh
> kafka-server-start.sh
> kafka-server-stop.sh
> kafka-simple-consumer-perf-test.sh
> kafka-simple-consumer-shell.sh
> run-rat.sh
> zookeeper-server-start.sh
> zookeeper-server-stop.sh
> zookeeper-shell.sh
> I think all the *-shell.sh scripts and all the *-simple-perf-test.sh scripts
> should die. If anyone has a use for these test classes, we can keep them
> around and use them via kafka-run-class, but they are clearly not made for
> normal people to use. The *-shell.sh scripts are obsolete now that we have
> the *-console-*.sh scripts, since these do everything the old scripts did
> and more. I recommend we also delete the code for these.
> I would like to change each tool so that it produces a usage line explaining
> what it does when run without arguments. Currently I actually had to go read
> the code to figure out what some of these are.
> I would like to clean up places where the arguments are non-standard.
> Argument names should be the same across all the tools.
> I would also like to rename kafka-replay-log-producer.sh to
> kafka-copy-topic.sh. I think this tool should also accept two zookeeper
> urls, the url of the input cluster and the url of the output cluster, so
> this tool can be used to copy between clusters. I think we can have a
> --zookeeper, an --input-zookeeper, and an --output-zookeeper, where
> --zookeeper is equivalent to setting both the input and the output
> zookeeper. I am also confused why the options for this tool list
> --brokerinfo, which can be either a zk url or a broker list, AND also
> --zookeeper, which must be a zk url.
> Any objections to all this? Any other gripes people have while I am in there?