[jira] [Created] (KAFKA-1766) Ecosystem docs subsection has wrong anchor

2014-11-11 Thread Kirill Zaborsky (JIRA)
Kirill Zaborsky created KAFKA-1766:
--

 Summary: Ecosystem docs subsection has wrong anchor
 Key: KAFKA-1766
 URL: https://issues.apache.org/jira/browse/KAFKA-1766
 Project: Kafka
  Issue Type: Bug
Reporter: Kirill Zaborsky
Priority: Minor


The following portion of HTML at http://kafka.apache.org/documentation.html 
seems to be wrong:

<h3><a id="upgrade">1.4 Ecosystem</a></h3>

it should be

<h3><a id="ecosystem">1.4 Ecosystem</a></h3>

Why don't you have the Kafka docs in GitHub as well? If you did, it would be trivial 
to create a PR to fix this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1753) add --decommission-broker option

2014-11-11 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206223#comment-14206223
 ] 

Dmitry Pekar edited comment on KAFKA-1753 at 11/11/14 9:58 AM:
---

[~nehanarkhede] Suppose we have a cluster [0..4] and want to remove broker 2. 
This could be achieved, for instance, via:
--replace-broker 2 --broker-list 0,1,3,4
The implementation will read all topics/partitions, find the replicas that 
are hosted on broker 2, and redistribute them more or less fairly across brokers 0,1,3,4. 
This should not affect the replication factor at all.

This approach (using --replace-broker with --broker-list vs. using 
--decommission-broker) gives the command wider applicability. For 
instance, scenarios such as removing broker 2 and redistributing its replicas 
between only brokers 0 and 4 could be implemented using this approach; see the sketch below.
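For illustration, a minimal sketch of that redistribution in Java (the partition-to-replica-list map, the class name, and the round-robin placement below are assumptions made for the example, not the actual reassignment tool code):
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: move every replica hosted on the removed broker to one of
// the target brokers, round-robin, without changing the replication factor.
public class ReplaceBrokerSketch {

    public static Map<String, List<Integer>> redistribute(Map<String, List<Integer>> assignment,
                                                           int removedBroker,
                                                           List<Integer> targetBrokers) {
        Map<String, List<Integer>> result = new HashMap<>();
        int next = 0;
        for (Map.Entry<String, List<Integer>> e : assignment.entrySet()) {
            List<Integer> replicas = new ArrayList<>(e.getValue());
            for (int i = 0; i < replicas.size(); i++) {
                if (replicas.get(i) == removedBroker) {
                    // pick the next target broker that is not already a replica of this partition
                    Integer candidate = null;
                    for (int tries = 0; tries < targetBrokers.size() && candidate == null; tries++) {
                        int b = targetBrokers.get(next++ % targetBrokers.size());
                        if (!replicas.contains(b)) candidate = b;
                    }
                    if (candidate == null) throw new IllegalStateException("no eligible target broker");
                    replicas.set(i, candidate);
                }
            }
            result.put(e.getKey(), replicas);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> assignment = new HashMap<>();
        assignment.put("topicA-0", Arrays.asList(0, 2));   // partition -> replica broker ids
        assignment.put("topicA-1", Arrays.asList(2, 4));
        // --replace-broker 2 --broker-list 0,1,3,4
        System.out.println(redistribute(assignment, 2, Arrays.asList(0, 1, 3, 4)));
    }
}
{code}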



was (Author: dmitry pekar):
[~nehanarkhede] Suppose we have a cluster [0..4] and want to remove broker 2. 
This could be achieved, for instance, via:
--replace-broker 2 --broker-list 0,1,3,4
The implementation will read all topics/partitions, find the replicas that 
are hosted on broker 2 and redistribute them more or less fairly across brokers 0,1,3,4

This approach (using --replace-broker with --broker-list vs. using 
--decommission-broker) gives the command wider applicability. For 
instance, scenarios such as removing broker 2 and redistributing its replicas 
between only brokers 0 and 4 could be implemented using this approach.


 add --decommission-broker option
 

 Key: KAFKA-1753
 URL: https://issues.apache.org/jira/browse/KAFKA-1753
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-11 Thread Vladimir Tretyakov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206303#comment-14206303
 ] 

Vladimir Tretyakov commented on KAFKA-1481:
---

Hi [~junrao], 
{quote}
Vladimir Tretyakov, we could probably just add a new MBean to expose the Kafka 
version number. Any value in exposing other things like build hash and build 
timestamp?
{quote}
A version will be OK, I think, but my question was also about where I can get this 
version. Should I hardcode it in the MBean? What is the best place for 
such information? Maybe you already have some special properties file?
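For illustration, a rough sketch of exposing a version through JMX (the properties file name, the key, and the MBean ObjectName below are hypothetical, not an existing Kafka convention):
{code}
import java.io.InputStream;
import java.lang.management.ManagementFactory;
import java.util.Properties;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class VersionMBeanSketch {

    // Hypothetical MBean interface exposing the version string.
    public interface AppInfoMXBean {
        String getVersion();
    }

    public static void register() throws Exception {
        // Hypothetical resource baked into the jar at build time.
        Properties props = new Properties();
        try (InputStream in = VersionMBeanSketch.class
                .getResourceAsStream("/kafka-version.properties")) {
            if (in != null) props.load(in);
        }
        final String version = props.getProperty("version", "unknown");

        AppInfoMXBean bean = new AppInfoMXBean() {
            @Override public String getVersion() { return version; }
        };
        ManagementFactory.getPlatformMBeanServer().registerMBean(
                new StandardMBean(bean, AppInfoMXBean.class, true),
                new ObjectName("kafka.common:type=AppInfo"));  // hypothetical name
    }
}
{code}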

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_2014-11-10_20-39-41_doc.patch, 
 KAFKA-1481_2014-11-10_21-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, 
 alternateLayout2.png, diff-for-alternate-layout1.patch, 
 diff-for-alternate-layout2.patch, originalLayout.png


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBeans names making it impossible to parse out 
 individual bits from MBeans.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-11 Thread Vladimir Tretyakov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206333#comment-14206333
 ] 

Vladimir Tretyakov commented on KAFKA-1481:
---

Hi again, 
65. Honestly, I'd prefer allBrokers=true and allTopics=true, or maybe 
brokers=aggregated, topics=aggregated.
At first glance the user will not know what this is about:
{code}
kafka.consumer:type=FetchRequestAndResponseMetrics,name=FetchRequestRateAndTimeMs,clientId=console-consumer-50964
{code}

and only after the user sees something like:

{code}
kafka.consumer:type=FetchRequestAndResponseMetrics,name=FetchRequestRateAndTimeMs,clientId=console-consumer-50964,
 broker=0
kafka.consumer:type=FetchRequestAndResponseMetrics,name=FetchRequestRateAndTimeMs,clientId=console-consumer-50964,
 broker=1
{code}

can the user understand (maybe) that the line without 'broker=..' is the aggregated 
value. I'd prefer an explicit definition.

[~jjkoshy], what do you think about brokers=aggregated, topics=aggregated? 
In jconsole the user will see aggregated instead of true, which is handy from 
my point of view.
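For illustration only, the aggregated MBean under that proposal might then read something like:
{code}
kafka.consumer:type=FetchRequestAndResponseMetrics,name=FetchRequestRateAndTimeMs,clientId=console-consumer-50964,broker=aggregated
{code}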


 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_2014-11-10_20-39-41_doc.patch, 
 KAFKA-1481_2014-11-10_21-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, 
 alternateLayout2.png, diff-for-alternate-layout1.patch, 
 diff-for-alternate-layout2.patch, originalLayout.png


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBeans names making it impossible to parse out 
 individual bits from MBeans.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1752) add --replace-broker option

2014-11-11 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206524#comment-14206524
 ] 

Joe Stein commented on KAFKA-1752:
--

I like the approach

--replace-broker 1 --broker-list 0
or 
--replace-broker 2 --broker-list 3,4

It makes KAFKA-1753 superfluous and makes operating these commands less confusing. So 
now (with this approach), both use cases (replace and decommission) are 
solved in this one feature :)

 add --replace-broker option
 ---

 Key: KAFKA-1752
 URL: https://issues.apache.org/jira/browse/KAFKA-1752
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1494) Failed to send messages after 3 tries.

2014-11-11 Thread Martin Tapp (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206624#comment-14206624
 ] 

Martin Tapp commented on KAFKA-1494:


Had the same problem; it only worked on auto-created topics. Thanks for posting 
that!

 Failed to send messages after 3 tries.
 --

 Key: KAFKA-1494
 URL: https://issues.apache.org/jira/browse/KAFKA-1494
 Project: Kafka
  Issue Type: Bug
  Components: controller, core
Affects Versions: 0.8.1.1
 Environment: Mac OS 
Reporter: darion yaphets
Assignee: Neha Narkhede

 I use the default server and zookeeper configs to start up a zookeeper server and 
 kafka broker on my machine to test a custom message based on protocol buffers. 
 I wrote a client to send a protobuf message to the kafka broker; the source code 
 is as follows:
 Properties properties = new Properties();
 properties.put("serializer.class", "java_example.ProtoBufMessage");
 properties.put("metadata.broker.list", "localhost:9092");
 ProducerConfig config = new ProducerConfig(properties);
 testBuf buffer = testBuf.newBuilder().setID(0)
     .setUrl("darion.yaphet.org").build();
 Producer<String, testBuf> producer = new Producer<String, testBuf>(config);
 producer.send(new KeyedMessage<String, testBuf>("protobuffer", buffer));
 The client debug log reports an exception:
 [FileSystemMoniter] INFO [main] kafka.utils.Logging$class.info(68) | 
 Disconnecting from localhost:9092
 [FileSystemMoniter] DEBUG [main] kafka.utils.Logging$class.debug(52) | 
 Successfully fetched metadata for 1 topic(s) Set(protobuffer)
 [FileSystemMoniter] WARN [main] kafka.utils.Logging$class.warn(83) | Error 
 while fetching metadata [{TopicMetadata for topic protobuffer - 
 No partition metadata for topic protobuffer due to 
 kafka.common.LeaderNotAvailableException}] for topic [protobuffer]: class 
 kafka.common.LeaderNotAvailableException 
 [FileSystemMoniter] ERROR [main] kafka.utils.Logging$class.error(97) | Failed 
 to send requests for topics protobuffer with correlation ids in [0,8]
 Exception in thread main kafka.common.FailedToSendMessageException: Failed 
 to send messages after 3 tries.
   at 
 kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)
   at kafka.producer.Producer.send(Producer.scala:76)
   at kafka.javaapi.producer.Producer.send(Producer.scala:33)
   at java_example.ProducerExamples.main(ProducerExamples.java:26)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Question about ZookeeperConsumerConnector

2014-11-11 Thread Jun Rao
Hi, Jiangjie,

Thanks for the investigation. Yes, this seems like a real issue.

1. It doesn't seem that we need to put the shutdownCommand back into the
queue. Once an iterator receives a shutdownCommand, it will be in a Done
state and will remain in that state forever.

2. Yes, we just need to get the unique set of queues and put in a
shutdownCommand
per queue.
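A minimal sketch of that idea in Java (illustrative only; the real connector is Scala, and the method and type names here are assumptions):

import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;

// Illustrative only: put exactly one shutdown marker per underlying queue,
// even when several topic/thread ids map to the same queue instance.
public class ShutdownSketch {
    public static <K, T> void sendShutdownToAllQueues(
            Map<K, BlockingQueue<T>> topicThreadIdAndQueues,
            T shutdownCommand) throws InterruptedException {
        Set<BlockingQueue<T>> uniqueQueues =
                Collections.newSetFromMap(new IdentityHashMap<BlockingQueue<T>, Boolean>());
        uniqueQueues.addAll(topicThreadIdAndQueues.values());
        for (BlockingQueue<T> queue : uniqueQueues) {
            queue.clear();                 // drop pending chunks so put() cannot block
            queue.put(shutdownCommand);
        }
    }
}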

Jun


On Mon, Nov 10, 2014 at 7:27 PM, Becket Qin becket@gmail.com wrote:

 Hi,

 We encountered a production issue recently in which Mirror Maker could not
 shut down properly because ZookeeperConsumerConnector is blocked on
 shutdown(). After looking into the code, we found 2 issues that caused this
 problem.

 1. After the consumer iterator receives the shutdownCommand, it puts the
 shutdownCommand back into the data chunk queue. Is there any reason for
 doing this?

 2. In ZookeeperConsumerConnector shutdown(), we could potentially put
 multiple shutdownCommand into the same data chunk queue, provided the
 topics are sharing the same data chunk queue in topicThreadIdAndQueues.
 (KAFKA-1764 is opened)

 In our case, we only have 1 consumer stream for all the topics, and the data
 chunk queue capacity is set to 1. The execution sequence causing the problem is
 as below:
 1. ZookeeperConsumerConnector shutdown() is called, it tries to put
 shutdownCommand for each queue in topicThreadIdAndQueues. Since we only
 have 1 queue, multiple shutdownCommand will be put into the queue.
 2. In sendShutdownToAllQueues(), between queue.clean() and
 queue.put(shutdownCommand), consumer iterator receives the shutdownCommand
 and put it back into the data chunk queue. After that,
 ZookeeperConsumerConnector tries to put another shutdownCommand into the
 data chunk queue but will block forever.


 The thread stack trace is as below:

 Thread-23 #58 prio=5 os_prio=0 tid=0x7ff440004800 nid=0x40a waiting
 on condition [0x7ff4f0124000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x000680b96bf0 (a
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 at

 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
 at
 java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:350)
 at

 kafka.consumer.ZookeeperConsumerConnector$$anonfun$sendShutdownToAllQueues$1.apply(ZookeeperConsumerConnector.scala:262)
 at

 kafka.consumer.ZookeeperConsumerConnector$$anonfun$sendShutdownToAllQueues$1.apply(ZookeeperConsumerConnector.scala:259)
 at scala.collection.Iterator$class.foreach(Iterator.scala:727)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 at
 scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
 at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
 at

 kafka.consumer.ZookeeperConsumerConnector.sendShutdownToAllQueues(ZookeeperConsumerConnector.scala:259)
 at

 kafka.consumer.ZookeeperConsumerConnector.liftedTree1$1(ZookeeperConsumerConnector.scala:199)
 at

 kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:192)
 - locked 0x000680dd5848 (a java.lang.Object)
 at

 kafka.tools.MirrorMaker$$anonfun$cleanShutdown$1.apply(MirrorMaker.scala:185)
 at

 kafka.tools.MirrorMaker$$anonfun$cleanShutdown$1.apply(MirrorMaker.scala:185)
 at scala.collection.immutable.List.foreach(List.scala:318)
 at kafka.tools.MirrorMaker$.cleanShutdown(MirrorMaker.scala:185)
 at kafka.tools.MirrorMaker$$anon$1.run(MirrorMaker.scala:169)


 Thanks.

 Jiangjie (Becket) Qin



Re: Question about ZookeeperConsumerConnector

2014-11-11 Thread Guozhang Wang
Thanks Jun!

Jiangjie, could you file a JIRA? Thanks.

Guozhang

On Tue, Nov 11, 2014 at 9:27 AM, Jun Rao jun...@gmail.com wrote:

 Hi, Jiangjie,

 Thanks for the investigation. Yes, this seems like a real issue.

 1. It doesn't seem that we need to put the shutdownCommand back into the
 queue. Once an iterator receives a shutdownCommand, it will be in a Done
 state and will remain in that state forever.

 2. Yes, we just need to get the unique set of queues and put in a
 shutdownCommand
 per queue.

 Jun


 On Mon, Nov 10, 2014 at 7:27 PM, Becket Qin becket@gmail.com wrote:

  Hi,
 
  We encountered a production issue recently that Mirror Maker could not
  properly shutdown because ZookeeperConsumerConnector is blocked on
  shutdown(). After looking into the code, we found 2 issues that caused
 this
  problem.
 
  1. After consumer iterator receives the shutdownCommand, It puts the
  shutdownCommand back into the data chunk queue. Is there any reason for
  doing this?
 
  2. In ZookeeperConsumerConnector shutdown(), we could potentially put
  multiple shutdownCommand into the same data chunk queue, provided the
  topics are sharing the same data chunk queue in topicThreadIdAndQueues.
  (KAFKA-1764 is opened)
 
  In our case, we only have 1 consumer stream for all the topics, the data
  chunk queue capacity is set to 1. The execution sequence causing problem
 is
  as below:
  1. ZookeeperConsumerConnector shutdown() is called, it tries to put
  shutdownCommand for each queue in topicThreadIdAndQueues. Since we only
  have 1 queue, multiple shutdownCommand will be put into the queue.
  2. In sendShutdownToAllQueues(), between queue.clean() and
  queue.put(shutdownCommand), consumer iterator receives the
 shutdownCommand
  and put it back into the data chunk queue. After that,
  ZookeeperConsumerConnector tries to put another shutdownCommand into the
  data chunk queue but will block forever.
 
 
  The thread stack trace is as below:
 
  Thread-23 #58 prio=5 os_prio=0 tid=0x7ff440004800 nid=0x40a waiting
  on condition [0x7ff4f0124000]
 java.lang.Thread.State: WAITING (parking)
  at sun.misc.Unsafe.park(Native Method)
  - parking to wait for  0x000680b96bf0 (a
  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  at
  java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  at
 
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
  at
 
 java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:350)
  at
 
 
 kafka.consumer.ZookeeperConsumerConnector$$anonfun$sendShutdownToAllQueues$1.apply(ZookeeperConsumerConnector.scala:262)
  at
 
 
 kafka.consumer.ZookeeperConsumerConnector$$anonfun$sendShutdownToAllQueues$1.apply(ZookeeperConsumerConnector.scala:259)
  at scala.collection.Iterator$class.foreach(Iterator.scala:727)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
  at
  scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
  at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
  at
 
 
 kafka.consumer.ZookeeperConsumerConnector.sendShutdownToAllQueues(ZookeeperConsumerConnector.scala:259)
  at
 
 
 kafka.consumer.ZookeeperConsumerConnector.liftedTree1$1(ZookeeperConsumerConnector.scala:199)
  at
 
 
 kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:192)
  - locked 0x000680dd5848 (a java.lang.Object)
  at
 
 
 kafka.tools.MirrorMaker$$anonfun$cleanShutdown$1.apply(MirrorMaker.scala:185)
  at
 
 
 kafka.tools.MirrorMaker$$anonfun$cleanShutdown$1.apply(MirrorMaker.scala:185)
  at scala.collection.immutable.List.foreach(List.scala:318)
  at kafka.tools.MirrorMaker$.cleanShutdown(MirrorMaker.scala:185)
  at kafka.tools.MirrorMaker$$anon$1.run(MirrorMaker.scala:169)
 
 
  Thanks.
 
  Jiangjie (Becket) Qin
 




-- 
-- Guozhang


[jira] [Commented] (KAFKA-1753) add --decommission-broker option

2014-11-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206698#comment-14206698
 ] 

Gwen Shapira commented on KAFKA-1753:
-

--replace-broker 1 --broker-list 2,3,4 can work.

I'm wondering about the semantics when the broker list is not given. Is it an 
error, or do we just move the partitions to any broker and try to balance 
things out?

The nice thing about the --decommission-broker semantics is that I can have a 30 
node cluster and decommission one broker without caring much about where the 
partitions go. A broker list with 29 brokers is pretty annoying :)

 add --decommission-broker option
 

 Key: KAFKA-1753
 URL: https://issues.apache.org/jira/browse/KAFKA-1753
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1765) slf4j binding conflict in slf4j-log4j12 and kafka-assembly

2014-11-11 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206702#comment-14206702
 ] 

Jun Rao commented on KAFKA-1765:


Yes, SimpleLogger was dragged in incorrectly in 0.8.0. You would have to 
explicitly remove that jar dependency if you want to use slf4j-log4j12. This 
issue has since been fixed in 0.8.1.x and after.

 slf4j binding conflict in slf4j-log4j12 and kafka-assembly
 --

 Key: KAFKA-1765
 URL: https://issues.apache.org/jira/browse/KAFKA-1765
 Project: Kafka
  Issue Type: Bug
  Components: log
Affects Versions: 0.8.0
 Environment: slf4j-log4j12 kafka-assembly
Reporter: shizhu, wang
Assignee: Jay Kreps
Priority: Critical

 Previously our project used slf4j bound to log4j for logging. But after importing 
 kafka-assembly.0.8.0.jar, logging no longer works as expected. It just keeps 
 printing log output to the console instead of to the log files. I looked into 
 kafka-assembly.0.8.0.jar and found there is a SimpleLogger:
  StringBuffer buf = new StringBuffer();
  long millis = System.currentTimeMillis();
  buf.append(millis - startTime);
  buf.append(" [");
  buf.append(Thread.currentThread().getName());
  buf.append("] ");
  buf.append(level);
  buf.append(" ");
  buf.append(name);
  buf.append(" - ");
  buf.append(message);
  buf.append(LINE_SEPARATOR);
  System.err.print(buf.toString());
  if (t != null)
      t.printStackTrace(System.err);
  System.err.flush();
 but I don't want this SimpleLogger in kafka-assembly. Can you advise how I can 
 get rid of the reference to this class? (I cannot remove the jar, because it is 
 necessary.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1753) add --decommission-broker option

2014-11-11 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206705#comment-14206705
 ] 

Joe Stein commented on KAFKA-1753:
--

 A broker list with 29 brokers is pretty annoying

That is a good point.

So does it make sense (and is this what we are saying) then to have

--decommission-broker 5

in which case you don't supply the list and it spreads the partitions around the 
cluster where it should,

and (for KAFKA-1752) --replace-broker 1 --broker-list 2,3,4 for explicit actions?



 add --decommission-broker option
 

 Key: KAFKA-1753
 URL: https://issues.apache.org/jira/browse/KAFKA-1753
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1753) add --decommission-broker option

2014-11-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206712#comment-14206712
 ] 

Gwen Shapira commented on KAFKA-1753:
-

Yes, I think this will be usable for most cases (at least any usage I can think 
of...)

 add --decommission-broker option
 

 Key: KAFKA-1753
 URL: https://issues.apache.org/jira/browse/KAFKA-1753
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1753) add --decommission-broker option

2014-11-11 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206845#comment-14206845
 ] 

Dmitry Pekar commented on KAFKA-1753:
-

OK. In that case I will add --decommission-broker X with semantics exactly the 
same as --replace-broker X with --broker-list set to all brokers except X.
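For illustration, deriving the implicit broker list for --decommission-broker X (a sketch assuming the tool already knows the live broker ids; the class and method names are made up):
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DecommissionSketch {
    // --decommission-broker X  ==  --replace-broker X --broker-list <all brokers except X>
    public static List<Integer> targetBrokers(List<Integer> allBrokers, int decommissioned) {
        List<Integer> targets = new ArrayList<>();
        for (int broker : allBrokers) {
            if (broker != decommissioned) targets.add(broker);
        }
        return targets;
    }

    public static void main(String[] args) {
        // e.g. cluster [0..4], decommission broker 2 -> [0, 1, 3, 4]
        System.out.println(targetBrokers(Arrays.asList(0, 1, 2, 3, 4), 2));
    }
}
{code}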

 add --decommission-broker option
 

 Key: KAFKA-1753
 URL: https://issues.apache.org/jira/browse/KAFKA-1753
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1753) add --decommission-broker option

2014-11-11 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206845#comment-14206845
 ] 

Dmitry Pekar edited comment on KAFKA-1753 at 11/11/14 7:09 PM:
---

OK. In that case I will add --decommission-broker X with semantics exactly the 
same as --replace-broker X with --broker-list set to all brokers except X.


was (Author: dmitry pekar):
OK. In that case I will add --decommission-broker X with semantics exactly the 
same as --replace-broker X with --broker-list set to all brokers except X.

 add --decommission-broker option
 

 Key: KAFKA-1753
 URL: https://issues.apache.org/jira/browse/KAFKA-1753
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-224) Should throw exception when serializer.class is not configured for Producer

2014-11-11 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-224.

Resolution: Not a Problem

Closing. Doesn't look like there's an issue here since we have a default.

 Should throw exception when serializer.class is not configured for Producer
 --

 Key: KAFKA-224
 URL: https://issues.apache.org/jira/browse/KAFKA-224
 Project: Kafka
  Issue Type: Bug
  Components: config, core
Affects Versions: 0.7
Reporter: Stone Gao
Priority: Minor
  Labels: newbie

   val props = new Properties();
   props.put("zk.connect", "127.0.0.1:2181");
   props.put("producer.type", "async");
   props.put("batch.size", 50)
   props.put("serializer.class", "kafka.serializer.StringEncoder");
   props.put("compression.codec", 1) // gzip
   val config = new ProducerConfig(props);
 If I remove the serializer.class config (props.put("serializer.class", 
 "kafka.serializer.StringEncoder")), the consumer shell can no longer get the 
 messages published by the producer, so it seems like something is wrong, but no 
 exception is thrown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1752) add --replace-broker option

2014-11-11 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206915#comment-14206915
 ] 

Joe Stein commented on KAFKA-1752:
--

[~nehanarkhede] Yes, there are two (similar) cases that I have seen requiring 
this (where without it lots of scripting happens):

1) Let's say you have a cluster of 12 nodes, and one goes away (because, say, the 
backplane on the VM box died). You don't want to be at 11 nodes but literally want 
to move everything from node 6 (say that was the one that went down) to node 13. 
Now, I know they could just put the new broker in service as node 6, but the 
reality is that a lot of folks use the IP address for the broker.id, and a new VM 
might already be launched (automatically on the VM failure) bringing a new node 
online if one fails. In this case they want to replace broker.id 172111255 with 
1721112542 (because the new VM for the new broker's IP is 172.11.125.42).

2) New hardware is another scenario.  A new server is ordered to replace some old 
hardware node.  You want to keep your X-node cluster running, so you add the new 
hardware and then want to replace node Y with node X+1 without having to shut down 
any brokers (some folks run at pretty peak capacity; granted, not a wonderful 
scenario, but they have to operate like that).

I have not seen a scenario where this would involve more than one broker, but if 
we can support a list instead of a single broker I don't think that some extra 
flexibility is detrimental in this case.

 add --replace-broker option
 ---

 Key: KAFKA-1752
 URL: https://issues.apache.org/jira/browse/KAFKA-1752
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1763) validate_index_log in system tests runs remotely but uses local paths

2014-11-11 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1763:
-
Reviewer: Joel Koshy

[~jjkoshy], [~mgharat] This is to fix the system tests that broke at LI. 
Probably best to review by applying it on the LI setup. 

 validate_index_log in system tests runs remotely but uses local paths
 -

 Key: KAFKA-1763
 URL: https://issues.apache.org/jira/browse/KAFKA-1763
 Project: Kafka
  Issue Type: Bug
  Components: system tests
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Fix For: 0.8.3

 Attachments: KAFKA-1763.patch


 validate_index_log is the only validation step in the system tests that needs 
 to execute a Kafka binary and it's currently doing so remotely, like the rest 
 of the test binaries. However, this is probably incorrect since it looks like 
 logs are synced back to the driver host and in other cases are operated on 
 locally. It looks like validate_index_log mixes up local/remote paths, 
 causing an exception in DumpLogSegments:
 {quote}
 2014-11-10 12:09:57,665 - DEBUG - executing command [ssh vagrant@worker1 -o 
 'HostName 127.0.0.1' -o 'Port ' -o 'UserKnownHostsFile /dev/null' -o 
 'StrictHostKeyChecking no' -o 'PasswordAuthentication no' -o 'IdentityFile 
 /Users/ewencp/.vagrant.d/insecure_private_key' -o 'IdentitiesOnly yes' -o 
 'LogLevel FATAL'  '/opt/kafka/bin/kafka-run-class.sh 
 kafka.tools.DumpLogSegments  --file 
 /Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/1294.index
  --verify-index-only 21'] (system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - Dumping 
 /Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/1294.index
  (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - Exception in thread main 
 java.io.FileNotFoundException: 
 /Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/1294.log
  (No such file or directory) (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at java.io.FileInputStream.open(Native 
 Method) (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 java.io.FileInputStream.init(FileInputStream.java:146) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 kafka.utils.Utils$.openChannel(Utils.scala:162) (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 kafka.log.FileMessageSet.init(FileMessageSet.scala:74) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 kafka.tools.DumpLogSegments$.kafka$tools$DumpLogSegments$$dumpIndex(DumpLogSegments.scala:108)
  (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 kafka.tools.DumpLogSegments$$anonfun$main$1.apply(DumpLogSegments.scala:80) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 kafka.tools.DumpLogSegments$$anonfun$main$1.apply(DumpLogSegments.scala:73) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 kafka.tools.DumpLogSegments$.main(DumpLogSegments.scala:73) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 kafka.tools.DumpLogSegments.main(DumpLogSegments.scala) 
 (kafka_system_test_utils)
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27735: Patch for KAFKA-1173

2014-11-11 Thread Ewen Cheslack-Postava

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27735/
---

(Updated Nov. 11, 2014, 9:51 p.m.)


Review request for kafka.


Bugs: KAFKA-1173
https://issues.apache.org/jira/browse/KAFKA-1173


Repository: kafka


Description (updated)
---

Add basic EC2 support, cleaner Vagrantfile, README cleanup, etc.


Better naming, hostmanager for routable VM names, vagrant-cachier to reduce 
startup cost, cleanup provisioning scripts, initial support for multiple 
zookeepers, general cleanup.


Don't sync a few directories that aren't actually required on the server.


Add generic worker node support.


Default # of workers should be 0


Add support for Zookeeper clusters.

This requires us to split up allocating VMs and provisioning because Vagrant
will run the provisioner for the first node before all nodes are allocated. This
leaves the first node running Zookeeper with unroutable peer hostnames which it,
for some reason, caches as unroutable. The cluster never properly finishes
forming since the nodes are unable to open connections to nodes booted later
than they were. The simple solution is to make sure all nodes are booted before
starting configuration so we have all the addresses and hostnames available and
routable.

Fix AWS provider commands in Vagrant README.


Addressing Joe's comments.


Diffs (updated)
-

  .gitignore 99b32a6770e3da59bc0167d77d45ca339ac3dbbd 
  README.md 9aca90664b2a80a37125775ddbdea06ba6c53644 
  Vagrantfile PRE-CREATION 
  vagrant/README.md PRE-CREATION 
  vagrant/base.sh PRE-CREATION 
  vagrant/broker.sh PRE-CREATION 
  vagrant/zk.sh PRE-CREATION 

Diff: https://reviews.apache.org/r/27735/diff/


Testing
---


Thanks,

Ewen Cheslack-Postava



[jira] [Updated] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka

2014-11-11 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-1173:
-
Attachment: KAFKA-1173_2014-11-11_13:50:55.patch

 Using Vagrant to get up and running with Apache Kafka
 -

 Key: KAFKA-1173
 URL: https://issues.apache.org/jira/browse/KAFKA-1173
 Project: Kafka
  Issue Type: Improvement
Reporter: Joe Stein
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, 
 KAFKA-1173_2014-11-11_13:50:55.patch


 Vagrant has been getting a lot of pickup in the tech communities.  I have 
 found it very useful for development and testing and working with a few 
 clients now using it to help virtualize their environments in repeatable ways.
 Using Vagrant to get up and running.
 For 0.8.0 I have a patch on github https://github.com/stealthly/kafka
 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/)
 2) Install Virtual Box 
 [https://www.virtualbox.org/](https://www.virtualbox.org/)
 In the main kafka folder
 1) ./sbt update
 2) ./sbt package
 3) ./sbt assembly-package-dependency
 4) vagrant up
 once this is done 
 * Zookeeper will be running 192.168.50.5
 * Broker 1 on 192.168.50.10
 * Broker 2 on 192.168.50.20
 * Broker 3 on 192.168.50.30
 When you are all up and running you will be back at a command prompt.  
 If you want you can log in to the machines using vagrant ssh machineName but 
 you don't need to.
 You can access the brokers and zookeeper by their IP
 e.g.
 bin/kafka-console-producer.sh --broker-list 
 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
 bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox 
 --from-beginning



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27735: Patch for KAFKA-1173

2014-11-11 Thread Ewen Cheslack-Postava


 On Nov. 11, 2014, 4:40 a.m., Joe Stein wrote:
  Vagrantfile, line 30
  https://reviews.apache.org/r/27735/diff/1/?file=754605#file754605line30
 
  we should use the environment variables AWS_ACCESS_KEY and 
  AWS_SECRET_KEY, which the vagrant aws provider will read instead: 
  https://github.com/mitchellh/vagrant-aws/blob/master/lib/vagrant-aws/config.rb#L257-L258
 
 Joe Stein wrote:
  Ok, I realized that you are using Vagrantfile.local for the configs, so I 
  think that there is no need to ALSO export the AWS_ACCESS_KEY and 
  AWS_SECRET_KEY, which would be total overkill.  Maybe just make that a bit 
  clearer in the README for folks?

Actually, I think supporting this is a good idea so I've added it. It's trivial 
to support and many people expect this to work. I happen to dislike that 
approach because I work with multiple AWS accounts so using environment 
variables can easily lead to using the wrong account by mistake, but it works 
well for many people.


- Ewen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27735/#review60746
---


On Nov. 11, 2014, 9:51 p.m., Ewen Cheslack-Postava wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27735/
 ---
 
 (Updated Nov. 11, 2014, 9:51 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1173
 https://issues.apache.org/jira/browse/KAFKA-1173
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Add basic EC2 support, cleaner Vagrantfile, README cleanup, etc.
 
 
 Better naming, hostmanager for routable VM names, vagrant-cachier to reduce 
 startup cost, cleanup provisioning scripts, initial support for multiple 
 zookeepers, general cleanup.
 
 
 Don't sync a few directories that aren't actually required on the server.
 
 
 Add generic worker node support.
 
 
 Default # of workers should be 0
 
 
 Add support for Zookeeper clusters.
 
 This requires us to split up allocating VMs and provisioning because Vagrant
 will run the provisioner for the first node before all nodes are allocated. 
 This
 leaves the first node running Zookeeper with unroutable peer hostnames which 
 it,
 for some reason, caches as unroutable. The cluster never properly finishes
 forming since the nodes are unable to open connections to nodes booted later
 than they were. The simple solution is to make sure all nodes are booted 
 before
 starting configuration so we have all the addresses and hostnames available 
 and
 routable.
 
 Fix AWS provider commands in Vagrant README.
 
 
 Addressing Joe's comments.
 
 
 Diffs
 -
 
   .gitignore 99b32a6770e3da59bc0167d77d45ca339ac3dbbd 
   README.md 9aca90664b2a80a37125775ddbdea06ba6c53644 
   Vagrantfile PRE-CREATION 
   vagrant/README.md PRE-CREATION 
   vagrant/base.sh PRE-CREATION 
   vagrant/broker.sh PRE-CREATION 
   vagrant/zk.sh PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/27735/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Ewen Cheslack-Postava
 




[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka

2014-11-11 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207148#comment-14207148
 ] 

Ewen Cheslack-Postava commented on KAFKA-1173:
--

Updated reviewboard https://reviews.apache.org/r/27735/diff/
 against branch origin/trunk

 Using Vagrant to get up and running with Apache Kafka
 -

 Key: KAFKA-1173
 URL: https://issues.apache.org/jira/browse/KAFKA-1173
 Project: Kafka
  Issue Type: Improvement
Reporter: Joe Stein
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, 
 KAFKA-1173_2014-11-11_13:50:55.patch


 Vagrant has been getting a lot of pickup in the tech communities.  I have 
 found it very useful for development and testing and working with a few 
 clients now using it to help virtualize their environments in repeatable ways.
 Using Vagrant to get up and running.
 For 0.8.0 I have a patch on github https://github.com/stealthly/kafka
 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/)
 2) Install Virtual Box 
 [https://www.virtualbox.org/](https://www.virtualbox.org/)
 In the main kafka folder
 1) ./sbt update
 2) ./sbt package
 3) ./sbt assembly-package-dependency
 4) vagrant up
 once this is done 
 * Zookeeper will be running 192.168.50.5
 * Broker 1 on 192.168.50.10
 * Broker 2 on 192.168.50.20
 * Broker 3 on 192.168.50.30
 When you are all up and running you will be back at a command prompt.  
 If you want you can log in to the machines using vagrant ssh machineName but 
 you don't need to.
 You can access the brokers and zookeeper by their IP
 e.g.
 bin/kafka-console-producer.sh --broker-list 
 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
 bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox 
 --from-beginning



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27735: Patch for KAFKA-1173

2014-11-11 Thread Ewen Cheslack-Postava


 On Nov. 11, 2014, 5:06 a.m., Joe Stein wrote:
  Vagrantfile, line 40
  https://reviews.apache.org/r/27735/diff/1/?file=754605#file754605line40
 
   It might be worth adding something about this in the README as it might 
   start to come up on the mailing list a bit

The security group was mentioned in the README, but I've expanded it to explain 
what ports you need to open up and that you should be careful about opening 
those ports to the world.


 On Nov. 11, 2014, 5:06 a.m., Joe Stein wrote:
  vagrant/README.md, line 31
  https://reviews.apache.org/r/27735/diff/1/?file=754606#file754606line31
 
  So, a couple things here.
  
  1) I don't think we need to call these things out in the README. They 
  can be in the comments, sure.
  
   2) I don't think that zookeeper issue is at play here though; that 
   issue is why folks run 5 zk in AWS in prod so they have HA. Not sure how it 
   relates here (since we are not changing the DNS and IP).
   
   3) I do like the extra step because it allows all 3 zookeeper servers 
   to get configured first (which is required) and then turned on. 
   
   So, I think we should just remove all 3 lines entirely

I added that note since the two steps would probably be confusing to anyone 
familiar with Vagrant. The issue with ZK has to do with DNS. It's not safe to 
start ZK until after vagrant-hostmanager has added the hostnames to /etc/hosts 
because ZK will cache the DNS lookup (which is what that JIRA is about) and 
will never be able to connect to the other nodes, and then of course Kafka 
won't be able to start up either.


 On Nov. 11, 2014, 5:06 a.m., Joe Stein wrote:
  vagrant/base.sh, line 41
  https://reviews.apache.org/r/27735/diff/1/?file=754607#file754607line41
 
  I like /opt/apache/kafka but /opt/kafka is ok too :) just nit picking

I took the path from your original patch :) Happy to change it, but unless 
there's some existing code that relies on the specific path I don't think it 
matters much. We could also just make this configurable...


 On Nov. 11, 2014, 5:06 a.m., Joe Stein wrote:
  vagrant/broker.sh, line 38
  https://reviews.apache.org/r/27735/diff/1/?file=754608#file754608line38
 
   hmmm, this is something I can see folks messing with a lot and doing 
   vagrant provision to get it right... having it outside the vm and available 
   makes sense, but I hate littering the repo; maybe, to make sure it doesn't get 
   committed, add it to the .gitignore?  config/server-* or something?

Actually this hasn't been a problem for me because Vagrant has been using rsync 
for the synced folder, both with VirtualBox and EC2. Files weren't getting 
synced back to the host. For EC2 it makes sense, and I guess the base box must 
not have virtualbox guest additions. I've added these to the .gitignore anyway 
since it doesn't hurt.


- Ewen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27735/#review60747
---


On Nov. 11, 2014, 9:51 p.m., Ewen Cheslack-Postava wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27735/
 ---
 
 (Updated Nov. 11, 2014, 9:51 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1173
 https://issues.apache.org/jira/browse/KAFKA-1173
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Add basic EC2 support, cleaner Vagrantfile, README cleanup, etc.
 
 
 Better naming, hostmanager for routable VM names, vagrant-cachier to reduce 
 startup cost, cleanup provisioning scripts, initial support for multiple 
 zookeepers, general cleanup.
 
 
 Don't sync a few directories that aren't actually required on the server.
 
 
 Add generic worker node support.
 
 
 Default # of workers should be 0
 
 
 Add support for Zookeeper clusters.
 
 This requires us to split up allocating VMs and provisioning because Vagrant
 will run the provisioner for the first node before all nodes are allocated. 
 This
 leaves the first node running Zookeeper with unroutable peer hostnames which 
 it,
 for some reason, caches as unroutable. The cluster never properly finishes
 forming since the nodes are unable to open connections to nodes booted later
 than they were. The simple solution is to make sure all nodes are booted 
 before
 starting configuration so we have all the addresses and hostnames available 
 and
 routable.
 
 Fix AWS provider commands in Vagrant README.
 
 
 Addressing Joe's comments.
 
 
 Diffs
 -
 
   .gitignore 99b32a6770e3da59bc0167d77d45ca339ac3dbbd 
   README.md 9aca90664b2a80a37125775ddbdea06ba6c53644 
   Vagrantfile PRE-CREATION 
   vagrant/README.md PRE-CREATION 
   vagrant/base.sh PRE-CREATION 
   vagrant/broker.sh PRE-CREATION 
   vagrant/zk.sh PRE-CREATION 
 
 Diff: 

Review Request 27890: Patch for KAFKA-1764

2014-11-11 Thread Jiangjie Qin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27890/
---

Review request for kafka.


Bugs: KAFKA-1764
https://issues.apache.org/jira/browse/KAFKA-1764


Repository: kafka


Description
---

fix for KAFKA-1764


Diffs
-

  core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
fbc680fde21b02f11285a4f4b442987356abd17b 

Diff: https://reviews.apache.org/r/27890/diff/


Testing
---


Thanks,

Jiangjie Qin



[jira] [Commented] (KAFKA-1764) ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue.

2014-11-11 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207278#comment-14207278
 ] 

Jiangjie Qin commented on KAFKA-1764:
-

Created reviewboard https://reviews.apache.org/r/27890/diff/
 against branch origin/trunk

 ZookeeperConsumerConnector could put multiple shutdownCommand to the same 
 data chunk queue.
 ---

 Key: KAFKA-1764
 URL: https://issues.apache.org/jira/browse/KAFKA-1764
 Project: Kafka
  Issue Type: Bug
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1764.patch


 In ZookeeperConsumerConnector shutdown(), we could potentially put multiple 
 shutdownCommand into the same data chunk queue, provided the topics are 
 sharing the same data chunk queue in topicThreadIdAndQueues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1764) ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue.

2014-11-11 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-1764:

Status: Patch Available  (was: Open)

 ZookeeperConsumerConnector could put multiple shutdownCommand to the same 
 data chunk queue.
 ---

 Key: KAFKA-1764
 URL: https://issues.apache.org/jira/browse/KAFKA-1764
 Project: Kafka
  Issue Type: Bug
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1764.patch


 In ZookeeperConsumerConnector shutdown(), we could potentially put multiple 
 shutdownCommand into the same data chunk queue, provided the topics are 
 sharing the same data chunk queue in topicThreadIdAndQueues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1764) ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue.

2014-11-11 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-1764:

Attachment: KAFKA-1764.patch

 ZookeeperConsumerConnector could put multiple shutdownCommand to the same 
 data chunk queue.
 ---

 Key: KAFKA-1764
 URL: https://issues.apache.org/jira/browse/KAFKA-1764
 Project: Kafka
  Issue Type: Bug
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1764.patch


 In ZookeeperConsumerConnector shutdown(), we could potentially put multiple 
 shutdownCommand into the same data chunk queue, provided the topics are 
 sharing the same data chunk queue in topicThreadIdAndQueues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1752) add --replace-broker option

2014-11-11 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207546#comment-14207546
 ] 

Joe Stein commented on KAFKA-1752:
--

[~nehanarkhede] I think you make a great point. If KAFKA-1753 
--decommission-broker ID has a strategy added to it that figures out where to 
place the partitions by evenly transferring that broker's partitions, then that 
solves this ticket at the same time too, agreed.

 add --replace-broker option
 ---

 Key: KAFKA-1752
 URL: https://issues.apache.org/jira/browse/KAFKA-1752
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] Message Metadata

2014-11-11 Thread Jun Rao
Thinking about this a bit more. For adding the auditing support, I am not
sure if we need to change the message format by adding the application
tags. An alternative way to do that is to add it in the producer client.
For example, for each message payload (doesn't matter what the
serialization mechanism is) that a producer receives, the producer can just
add a header before the original payload. The header will contain all
needed fields (e.g. timestamp, host, etc) for the purpose of auditing. This
way, we don't need to change the message format and the auditing info can
be added independent of the serialization mechanism of the message. The
header can use a different serialization mechanism for better efficiency.
For example, if we use Avro to serialize the header, the encoded bytes
won't include the field names in the header. This is potentially more
efficient than representing those fields as application tags in the message,
where the tags have to be explicitly stored in every message.

To make it easier for the client to add and make use of this kind of
auditing support, I was imagining that we can add a ProducerFactory in the
new java client. The ProducerFactory will create an instance of Producer
based on a config property. By default, the current KafkaProducer will be
returned. However, a user can plug in a different implementation of
Producer that does auditing. For example, an implementation of an
AuditProducer.send() can take the original ProducerRecord, add the header
to the value byte array and then forward the record to an underlying
KafkaProducer. We can add a similar ConsumerFactory to the new consumer
client. If a user plugs in an implementation of the AuditingConsumer, the
consumer will then be audited automatically.
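A rough sketch of that wrapping idea (the Producer interface, the AuditProducer class, and the plain-text header below are hypothetical simplifications for brevity, not the real new-client API; a real header could be Avro-encoded as described above):

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class AuditProducerSketch {

    // Hypothetical minimal producer abstraction, standing in for the real client interface.
    public interface Producer {
        void send(String topic, byte[] key, byte[] value);
    }

    // Wraps another producer and prepends an audit header to every value byte array.
    public static class AuditProducer implements Producer {
        private final Producer delegate;
        private final String host;

        public AuditProducer(Producer delegate, String host) {
            this.delegate = delegate;
            this.host = host;
        }

        @Override
        public void send(String topic, byte[] key, byte[] value) {
            // Plain-text header for brevity; a real implementation could encode
            // these fields with Avro or another compact serialization.
            byte[] header = (System.currentTimeMillis() + "," + host)
                    .getBytes(StandardCharsets.UTF_8);
            // Layout: [4-byte header length][header bytes][original payload]
            ByteBuffer buf = ByteBuffer.allocate(4 + header.length + value.length);
            buf.putInt(header.length).put(header).put(value);
            delegate.send(topic, key, buf.array());
        }
    }
}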

Thanks,

Jun

On Tue, Oct 21, 2014 at 4:06 PM, Guozhang Wang wangg...@gmail.com wrote:

 Hi Jun,

 Regarding 4) in your comment, after thinking about it for a while I cannot come
 up with a way to do it along with log compaction without adding new fields to the
 current message set format. Do you have a better way that does not require
 protocol changes?

 Guozhang

 On Mon, Oct 20, 2014 at 9:53 AM, Guozhang Wang wangg...@gmail.com wrote:

  I have updated the wiki page incorporating received comments. We can
  discuss some more details on:
 
  1. How do we want to do auditing? Do we want to have built-in auditing on
  brokers or even MMs, or use an audit consumer to fetch all messages from
  just the brokers?
 
  2. How we can avoid de-/re-compression on brokers and MMs with log
  compaction turned on.
 
  3. How we can resolve the data inconsistency resulting from unclean leader
  election with control messages.
 
  Guozhang
 
  On Sun, Oct 19, 2014 at 11:41 PM, Guozhang Wang wangg...@gmail.com
  wrote:
 
  Thanks for the detailed comments Jun! Some replies inlined.
 
  On Sun, Oct 19, 2014 at 7:42 PM, Jun Rao jun...@gmail.com wrote:
 
  Hi, Guozhang,
 
  Thanks for the writeup.
 
  A few high level comments.
 
  1. Associating (versioned) schemas to a topic can be a good thing
  overall.
  Yes, this could add a bit more management overhead in Kafka. However,
 it
  makes sure that the data format contract between a producer and a
  consumer
  is kept and managed in a central place, instead of in the application.
  The
  latter is probably easier to start with, but is likely to be brittle in
  the
  long run.
 
 
  I am actually not proposing not to support associated versioned schemas
  for topics, but not to let core Kafka functionality like auditing
  depend on schemas. I think this alone can separate schema
  management from Kafka piping management (i.e. making sure every single
  message is delivered, and within some latency, etc). Adding additional
  auditing info into an existing schema would force Kafka to be aware of the
  schema systems (Avro, JSON, etc).
 
 
 
  2. Auditing can be a general feature that's useful for many
 applications.
  Such a feature can be implemented by extending the low level message
  format
  with a header. However, it can also be added as part of the schema
  management. For example, you can imagine a type of audited schema that
  adds
  additional auditing info to an existing schema automatically.
 Performance
  wise, it probably doesn't make a big difference whether the auditing
 info
  is added in the message header or the schema header.
 
 
  See replies above.
 
 
  3. We talked about avoiding the overhead of decompressing in both the
  broker and the mirror maker. We probably need to think through how this
  works with auditing. In the more general case where you want to audit
  every
  message, one has to do the decompression to get the individual message,
  independent of how the auditing info is stored. This means that if we
  want
  to audit the broker directly or the consumer in mirror maker, we have
 to
  pay the decompression cost anyway. Similarly, if we want to extend
 mirror
  maker to support some customized filtering/transformation 

[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication

2014-11-11 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207578#comment-14207578
 ] 

Jun Rao commented on KAFKA-1684:


I am not sure if we need to use the same port. HBase supports both Kerberos and 
SSL, right? Does it use a separate port? Can one enable both types of secure 
ports at the same time?

 Implement TLS/SSL authentication
 

 Key: KAFKA-1684
 URL: https://issues.apache.org/jira/browse/KAFKA-1684
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.9.0
Reporter: Jay Kreps
Assignee: Ivan Lyutov
 Attachments: KAFKA-1684.patch


 Add an SSL port to the configuration and advertise this as part of the 
 metadata request.
 If the SSL port is configured the socket server will need to add a second 
 Acceptor thread to listen on it. Connections accepted on this port will need 
 to go through the SSL handshake prior to being registered with a Processor 
 for request processing.
 SSL requests and responses may need to be wrapped or unwrapped using the 
 SSLEngine that was initialized by the acceptor. This wrapping and unwrapping 
 is very similar to what will need to be done for SASL-based authentication 
 schemes. We should have a uniform interface that covers both of these and we 
 will need to store the instance in the session with the request. The socket 
 server will have to use this object when reading and writing requests. We 
 will need to take care with the FetchRequests as the current 
 FileChannel.transferTo mechanism will be incompatible with wrap/unwrap so we 
 can only use this optimization for unencrypted sockets that don't require 
 userspace translation (wrapping).
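
Purely as a sketch of the wrap path described above (not code from the attached patch), with
buffer management, handshaking and renegotiation omitted:

{code}
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;

// Sketch: encrypt outgoing application bytes with the SSLEngine created by the
// acceptor before writing them to the socket on the SSL port.
public final class SslWriteSketch {
    public static void write(SSLEngine engine, SocketChannel channel, ByteBuffer appData)
            throws Exception {
        ByteBuffer netData = ByteBuffer.allocate(engine.getSession().getPacketBufferSize());
        while (appData.hasRemaining()) {
            netData.clear();
            SSLEngineResult result = engine.wrap(appData, netData);
            if (result.getStatus() != SSLEngineResult.Status.OK)
                throw new IllegalStateException("Unexpected wrap status: " + result.getStatus());
            netData.flip();
            while (netData.hasRemaining())
                channel.write(netData);   // encrypted bytes; transferTo cannot be used here
        }
    }
}
{code}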



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Kafka consumer transactional support

2014-11-11 Thread Jun Rao
The transactional work is currently just a prototype. The release wiki is a
bit outdated and I just updated it.

Do you have a particular transactional use case in mind?

Thanks,

Jun

On Mon, Nov 10, 2014 at 3:23 PM, Falabella, Anthony 
anthony.falabe...@citi.com wrote:

 I've tried to search a bit for what the current status is for Kafka
 supporting consumer transactions.

 These links look to be the best indication of where things stand with that:
 http://search-hadoop.com/m/4TaT4xeqNq1


 https://cwiki.apache.org/confluence/display/KAFKA/Transactional+Messaging+in+Kafka

 Based on the release plan link it looks like the hope was to support it in
 Q2 2014:
 https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan

 Can someone let me know how far along the transactional_messaging branch
 is?  Are there any best-effort ways to do consumer transactional support
 until that's been implemented? (In case it's relevant: I expect to have
 approx. 4 partitions on my topics and 2-node clusters.)

 Thanks
 Tony
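
One common best-effort pattern until real transactions land (not an answer from this thread,
and the configuration values below are illustrative) is to disable auto-commit in the 0.8
high-level consumer and commit offsets only after processing succeeds. That gives
at-least-once delivery rather than true transactions, so processing should be idempotent:

{code}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class AtLeastOnceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");  // illustrative
        props.put("group.id", "my-group");                 // illustrative
        props.put("auto.commit.enable", "false");          // commit manually
        ConsumerConnector consumer =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
            consumer.createMessageStreams(Collections.singletonMap("my-topic", 1));
        ConsumerIterator<byte[], byte[]> it = streams.get("my-topic").get(0).iterator();
        while (it.hasNext()) {
            byte[] message = it.next().message();
            process(message);          // e.g. write to a DB inside a local transaction
            consumer.commitOffsets();  // commit only after processing succeeded
        }
    }

    private static void process(byte[] message) { /* application logic */ }
}
{code}

If the process dies before the commit, the uncommitted messages are redelivered on restart,
which is why the processing step needs to tolerate duplicates.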




[jira] [Commented] (KAFKA-1763) validate_index_log in system tests runs remotely but uses local paths

2014-11-11 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207592#comment-14207592
 ] 

Mayuresh Gharat commented on KAFKA-1763:


Yeah, we will probably test it out today. 

 validate_index_log in system tests runs remotely but uses local paths
 -

 Key: KAFKA-1763
 URL: https://issues.apache.org/jira/browse/KAFKA-1763
 Project: Kafka
  Issue Type: Bug
  Components: system tests
Affects Versions: 0.8.1.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
 Fix For: 0.8.3

 Attachments: KAFKA-1763.patch


 validate_index_log is the only validation step in the system tests that needs 
 to execute a Kafka binary and it's currently doing so remotely, like the rest 
 of the test binaries. However, this is probably incorrect since it looks like 
 logs are synced back to the driver host and in other cases are operated on 
 locally. It looks like validate_index_log mixes up local/remote paths, 
 causing an exception in DumpLogSegments:
 {quote}
 2014-11-10 12:09:57,665 - DEBUG - executing command [ssh vagrant@worker1 -o 
 'HostName 127.0.0.1' -o 'Port ' -o 'UserKnownHostsFile /dev/null' -o 
 'StrictHostKeyChecking no' -o 'PasswordAuthentication no' -o 'IdentityFile 
 /Users/ewencp/.vagrant.d/insecure_private_key' -o 'IdentitiesOnly yes' -o 
 'LogLevel FATAL'  '/opt/kafka/bin/kafka-run-class.sh 
 kafka.tools.DumpLogSegments  --file 
 /Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/1294.index
  --verify-index-only 2>&1'] (system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - Dumping 
 /Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/1294.index
  (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - Exception in thread main 
 java.io.FileNotFoundException: 
 /Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/1294.log
  (No such file or directory) (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at java.io.FileInputStream.open(Native 
 Method) (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 java.io.FileInputStream.<init>(FileInputStream.java:146) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 kafka.utils.Utils$.openChannel(Utils.scala:162) (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 kafka.log.FileMessageSet.<init>(FileMessageSet.scala:74) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 kafka.tools.DumpLogSegments$.kafka$tools$DumpLogSegments$$dumpIndex(DumpLogSegments.scala:108)
  (kafka_system_test_utils)
 2014-11-10 12:09:58,673 - DEBUG - at 
 kafka.tools.DumpLogSegments$$anonfun$main$1.apply(DumpLogSegments.scala:80) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 kafka.tools.DumpLogSegments$$anonfun$main$1.apply(DumpLogSegments.scala:73) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 kafka.tools.DumpLogSegments$.main(DumpLogSegments.scala:73) 
 (kafka_system_test_utils)
 2014-11-10 12:09:58,674 - DEBUG - at 
 kafka.tools.DumpLogSegments.main(DumpLogSegments.scala) 
 (kafka_system_test_utils)
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: Kafka-trunk #327

2014-11-11 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/327/changes

Changes:

[jjkoshy] KAFKA-1742; ControllerContext removeTopic does not correctly update 
state; reviewed by Joel Koshy, Guozhang Wang and Neha Narkhede

--
[...truncated 1259 lines...]

kafka.admin.AdminTest > testPartitionReassignmentNonOverlappingReplicas FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
at 
kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testReassigningNonExistingPartition FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
at 
kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testResumePartitionReassignmentThatWasCompleted FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
at 
kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testPreferredReplicaJsonData FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
at 
kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testBasicPreferredReplicaElection FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
at 
kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testShutdownBroker FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
at 
kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testTopicConfigChange FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
at 

[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication

2014-11-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207625#comment-14207625
 ] 

Gwen Shapira commented on KAFKA-1684:
-

I don't think HBase is the right model here (since it's significantly more 
complex than what we are trying to do), but let's take a look at what HBase is 
doing:

1. HBase has an optional REST API. SSL is only supported on the REST API. Not 
across every API the product exposes.
2. The REST API supports SSL on the same port as non-secure. The decision on 
whether communication is secured or not is done based on a configuration file. 
If the REST API is set to SSL, non-SSL connections are not supported. 
3. REST API does not support Kerberos authentication for users. It does take 
Kerberos tokens as a way to identify the REST Server itself to the main HBase 
servers (Region servers and masters).
4. HBase Region servers and masters support Kerberos and not SSL. Kerberos is 
supported on the same port as non-secure traffic. To the best of my understanding (based on 
this: 
http://blog.cloudera.com/blog/2012/09/understanding-user-authentication-and-authorization-in-apache-hbase/)
 if security is enabled on the server, non-authenticated clients are denied 
access.

All this is very different from what we are trying to achieve for Kafka.

Actually, the fact that we are having a difficult time finding any good model 
for what we are trying to do (supporting non-secure, SSL and Kerberos at the 
same time on the same APIs and channels) is a bit concerning. 

It is definitely possible to implement a single port for SASL and non-secure by 
adding an AUTH request to the API (based on what ZK is doing).
I don't know SSL well enough to know if we can use the same port for both SSL 
and Kerberos at the same time (i.e. without forcing SSL or Kerberos via 
configs).
I still think 3 different ports for the 3 security modes is the simplest 
solution.
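
For what it's worth, the "AUTH request first" idea could be sketched roughly as below. The
framing and the AUTH key are invented for illustration; Kafka does not define such a request
today:

{code}
import java.nio.ByteBuffer;

// Hypothetical: every connection starts with a size-delimited request whose
// first field is a 2-byte API key. If the first request carries the (made-up)
// AUTH key, the processor runs a SASL exchange before serving normal requests;
// otherwise the connection is treated as unauthenticated on the same port.
public final class FirstRequestDispatchSketch {
    static final short HYPOTHETICAL_AUTH_KEY = 100;  // not a real Kafka API key

    public static boolean startsWithAuth(ByteBuffer firstRequest) {
        ByteBuffer buf = firstRequest.duplicate();
        buf.getInt();                   // request size
        short apiKey = buf.getShort();  // API key of the first request
        return apiKey == HYPOTHETICAL_AUTH_KEY;
    }
}
{code}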


 Implement TLS/SSL authentication
 

 Key: KAFKA-1684
 URL: https://issues.apache.org/jira/browse/KAFKA-1684
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.9.0
Reporter: Jay Kreps
Assignee: Ivan Lyutov
 Attachments: KAFKA-1684.patch


 Add an SSL port to the configuration and advertise this as part of the 
 metadata request.
 If the SSL port is configured the socket server will need to add a second 
 Acceptor thread to listen on it. Connections accepted on this port will need 
 to go through the SSL handshake prior to being registered with a Processor 
 for request processing.
 SSL requests and responses may need to be wrapped or unwrapped using the 
 SSLEngine that was initialized by the acceptor. This wrapping and unwrapping 
 is very similar to what will need to be done for SASL-based authentication 
 schemes. We should have a uniform interface that covers both of these and we 
 will need to store the instance in the session with the request. The socket 
 server will have to use this object when reading and writing requests. We 
 will need to take care with the FetchRequests as the current 
 FileChannel.transferTo mechanism will be incompatible with wrap/unwrap so we 
 can only use this optimization for unencrypted sockets that don't require 
 userspace translation (wrapping).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: No longer supporting Java 6, if? when?

2014-11-11 Thread Gwen Shapira
Perhaps relevant:

Hadoop is moving toward dropping Java6 in next release.
https://issues.apache.org/jira/browse/HADOOP-10530


On Thu, Nov 6, 2014 at 11:03 AM, Jay Kreps jay.kr...@gmail.com wrote:

 Yeah it is a little bit silly that people are still using Java 6.

 I guess this is a tradeoff--being more conservative in our java support
 means more people can use our software, whereas upgrading gives us
 developers a better experience since we aren't stuck with ancient stuff.

 Nonetheless I would argue for being a bit conservative here. Sadly a
 shocking number of people are still using Java 6. The Kafka clients get
 embedded in applications all over the place, and likely having even one
 application not yet upgraded would block adopting the new Kafka version
 that dropped java 6 support. So unless there is something in Java 7 we
 really really want I think it might be good to hold out a bit.

 As an example we dropped java 6 support in Samza and immediately had people
 blocked by that, and unlike the Kafka clients, Samza use is pretty
 centralized.

 -Jay

 On Wed, Nov 5, 2014 at 5:32 PM, Joe Stein joe.st...@stealth.ly wrote:

  This has been coming up in a lot of projects and for other reasons too. I
  wanted to kick off the discussion about if/when we end support for Java 6.
  Besides any API we may want to use in >= 7, we also compile our binaries for
  6 for release currently.
 
  /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
  /
 



[jira] [Commented] (KAFKA-1752) add --replace-broker option

2014-11-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207635#comment-14207635
 ] 

Gwen Shapira commented on KAFKA-1752:
-

So it looks like we actually want --add-broker (and transfer some load to it) 
and --decommission-broker (and transfer its load somewhere)?

This makes sense to me.
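
As an illustration of the "transfer its load somewhere" step (not an implementation from any
patch), one simple strategy keeps the replication factor intact by swapping each replica on
the decommissioned broker for the least-loaded surviving broker not already in that
partition's replica list:

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: assumes at least one survivor is not already a replica of each
// affected partition, and ignores rack awareness and leader balance. Real
// tooling would emit this as a plan for the reassignment tool.
public class DecommissionSketch {
    public static Map<String, List<Integer>> reassign(Map<String, List<Integer>> assignment,
                                                      int deadBroker, List<Integer> survivors) {
        Map<Integer, Integer> load = new HashMap<Integer, Integer>();
        for (int b : survivors) load.put(b, 0);
        for (List<Integer> replicas : assignment.values())
            for (int b : replicas)
                if (load.containsKey(b)) load.put(b, load.get(b) + 1);

        Map<String, List<Integer>> result = new HashMap<String, List<Integer>>();
        for (Map.Entry<String, List<Integer>> e : assignment.entrySet()) {
            List<Integer> replicas = new ArrayList<Integer>(e.getValue());
            int idx = replicas.indexOf(deadBroker);
            if (idx >= 0) {
                int target = -1;
                for (int b : survivors)
                    if (!replicas.contains(b) && (target < 0 || load.get(b) < load.get(target)))
                        target = b;
                replicas.set(idx, target);                 // replication factor unchanged
                load.put(target, load.get(target) + 1);
            }
            result.put(e.getKey(), replicas);
        }
        return result;
    }
}
{code}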

 add --replace-broker option
 ---

 Key: KAFKA-1752
 URL: https://issues.apache.org/jira/browse/KAFKA-1752
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication

2014-11-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207653#comment-14207653
 ] 

Gwen Shapira commented on KAFKA-1684:
-

If we look at the wiki 
(https://cwiki.apache.org/confluence/display/KAFKA/Security), it looks like we 
already decided on a separate port for SSL, and are debating whether SASL will be on 
the same port as un-authenticated connections or on a third port. 

So the current patch will work in this regard.

Now just 8 high-level points and 5 low-level points left to tackle :)

 Implement TLS/SSL authentication
 

 Key: KAFKA-1684
 URL: https://issues.apache.org/jira/browse/KAFKA-1684
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.9.0
Reporter: Jay Kreps
Assignee: Ivan Lyutov
 Attachments: KAFKA-1684.patch


 Add an SSL port to the configuration and advertise this as part of the 
 metadata request.
 If the SSL port is configured the socket server will need to add a second 
 Acceptor thread to listen on it. Connections accepted on this port will need 
 to go through the SSL handshake prior to being registered with a Processor 
 for request processing.
 SSL requests and responses may need to be wrapped or unwrapped using the 
 SSLEngine that was initialized by the acceptor. This wrapping and unwrapping 
 is very similar to what will need to be done for SASL-based authentication 
 schemes. We should have a uniform interface that covers both of these and we 
 will need to store the instance in the session with the request. The socket 
 server will have to use this object when reading and writing requests. We 
 will need to take care with the FetchRequests as the current 
 FileChannel.transferTo mechanism will be incompatible with wrap/unwrap so we 
 can only use this optimization for unencrypted sockets that don't require 
 userspace translation (wrapping).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1667) topic-level configuration not validated

2014-11-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207668#comment-14207668
 ] 

Gwen Shapira commented on KAFKA-1667:
-

BTW, [~edio], you probably want to assign the JIRA to yourself and click on 
Submit Patch so this will be more visible as something that needs review + 
commit.

  topic-level configuration not validated
 

 Key: KAFKA-1667
 URL: https://issues.apache.org/jira/browse/KAFKA-1667
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ryan Berdeen
  Labels: newbie
 Attachments: KAFKA-1667.patch, KAFKA-1667_2014-11-05_19:43:53.patch, 
 KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch


 I was able to set the configuration for a topic to these invalid values:
 {code}
 Topic:topic-config-test  PartitionCount:1ReplicationFactor:2 
 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol
 {code}
 It seems that the values are saved as long as they are the correct type, but 
 are not validated like the corresponding broker-level properties.
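
A rough sketch of the kind of per-property check the report asks for is below. The property
names come from the example above, but the exact bounds and the shape of the validation hook
are assumptions, not existing Kafka code:

{code}
import java.util.Properties;

// Sketch: reject topic-level overrides that parse but are semantically invalid,
// mirroring the validation already applied to broker-level properties.
public class TopicConfigValidationSketch {
    public static void validate(Properties props) {
        if (props.containsKey("min.cleanable.dirty.ratio")) {
            double ratio = Double.parseDouble(props.getProperty("min.cleanable.dirty.ratio"));
            if (ratio < 0.0 || ratio > 1.0)
                throw new IllegalArgumentException("min.cleanable.dirty.ratio must be in [0, 1]");
        }
        if (props.containsKey("segment.bytes")
                && Long.parseLong(props.getProperty("segment.bytes")) <= 0)
            throw new IllegalArgumentException("segment.bytes must be positive");
        if (props.containsKey("retention.ms")
                && Long.parseLong(props.getProperty("retention.ms")) <= 0)
            throw new IllegalArgumentException("retention.ms must be positive");
        if (props.containsKey("cleanup.policy")) {
            String policy = props.getProperty("cleanup.policy");
            if (!policy.equals("delete") && !policy.equals("compact"))
                throw new IllegalArgumentException("cleanup.policy must be 'delete' or 'compact'");
        }
    }
}
{code}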



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1694) kafka command line shell

2014-11-11 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein updated KAFKA-1694:
-
Description: 


https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements

  was:
There are a lot of different Kafka tools.  Right now I think 5-6 of them are 
being used commonly.

I was thinking we could start by taking 
https://cwiki.apache.org/confluence/display/KAFKA/System+Tools and 
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools and 
exposing them in a plugin-type way for a command line shell interface.

It would be both: command line and shell.

so 

kafka -b brokerlist -a reassign-partition status

would run from the cli and

kafka shell -b brokerlist
kafka>describe;
... kafka-topics.sh --describe
kafka>set topic_security['pci','profile','dss'] = true
...etc

An important item: for folks that are using the existing tools, we should have an 
easy API-style way for them to keep doing that and get the benefit too.

This interface should also expose some monitoring stats, e.g. visualize in 
text the consumer lag trend between committed offset and log end offset.

kafka>use topic name;
kafka>stats;



 kafka command line shell
 

 Key: KAFKA-1694
 URL: https://issues.apache.org/jira/browse/KAFKA-1694
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
Priority: Critical
 Fix For: 0.8.3


 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1694) kafka command line and centralized operations

2014-11-11 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein updated KAFKA-1694:
-
Summary: kafka command line and centralized operations  (was: kafka command 
line shell)

 kafka command line and centralized operations
 -

 Key: KAFKA-1694
 URL: https://issues.apache.org/jira/browse/KAFKA-1694
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
Priority: Critical
 Fix For: 0.8.3


 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Kafka Command Line Shell

2014-11-11 Thread Joe Stein
I started writing this up on the wiki
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements

Instead of starting a new thread I figured I would just continue this one I started.
I also added another (important) component for centralized management of
configuration at the global level, much like we have at the topic level.  These
global configurations could be overridden (perhaps not all of them) from
server.properties on start (so, for example, one broker can still use a different
port).

One concern I have is that using RQ/RP wire protocol to the
controller instead of the current way (via ZK admin path) may expose
concurrency on the admin requests, which may not be supported yet.

Guozhang, take a look at the diagram for how I am thinking of this: it would be
a new request handler that executes the tools pretty much as they are
today. My thinking is maybe to do them one at a time (TopicCommand first, I
think) and have what TopicCommand does happen on the server, sending
the RQ/RP to/from the client but executing on the server. If there is something
not supported we will of course have to deal with that and implement it.
Once we get one working end to end I think adding the rest will be
(more or less) concise iterations to get it done. I added your concern to
the wiki under the gotchas section.
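
To make the idea concrete, a purely hypothetical server-side dispatch for such admin
requests might look like the sketch below; none of these classes exist in Kafka, and the
real design would live behind the RQ/RP wire protocol described on the wiki:

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical: the CLI/shell sends a generic admin command (name + arguments)
// and a broker-side handler dispatches it to the same logic the standalone
// tools (e.g. TopicCommand) run today, returning the result in the response.
public class AdminCommandHandlerSketch {
    interface CommandAction {
        String execute(Map<String, String> args);
    }

    private final Map<String, CommandAction> actions = new HashMap<String, CommandAction>();

    public AdminCommandHandlerSketch() {
        actions.put("describe-topic", new CommandAction() {
            public String execute(Map<String, String> args) {
                // would call into the code path kafka-topics.sh --describe uses today
                return "topic=" + args.get("topic") + " partitions=... replicas=...";
            }
        });
    }

    public String handle(String command, Map<String, String> args) {
        CommandAction action = actions.get(command);
        return action == null ? "ERROR: unknown admin command " + command
                              : action.execute(args);
    }
}
{code}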

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/

On Mon, Oct 20, 2014 at 2:15 AM, Guozhang Wang wangg...@gmail.com wrote:

 One concern I have is that using RQ/RP wire protocol to the controller
 instead of the current way (via ZK admin path) may expose concurrency on
 the admin requests, which may not be supported yet.

 Some initial discussion about this is on KAFKA-1305.

 Guozhang

 On Sun, Oct 19, 2014 at 1:55 PM, Joe Stein joe.st...@stealth.ly wrote:

  Maybe we should add some AdminMessage RQ/RP wire protocol structure(s) and
  let the controller handle it? We could then build the CLI and Shell in the
  project, both as useful tools and as samples for others.
 
  Making an HTTP interface should be simple after KAFKA-1494 is done, which
  all client libraries could offer.
 
  I will update the design tonight/tomorrow and should be able to have
  someone starting to work on it this week.
 
  /***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop
  /
  On Oct 19, 2014 1:21 PM, Harsha ka...@harsha.io wrote:
 
   +1 for Web Api
  
   On Sat, Oct 18, 2014, at 11:48 PM, Glen Mazza wrote:
Apache Karaf has been doing this for quite a few years, albeit in
 Java
not Scala.  Still, their coding approach to creating a CLI probably
captures many lessons learned over that time.
   
Glen
   
On 10/17/2014 08:03 PM, Joe Stein wrote:
 Hi, I have been thinking about the ease of use for operations with Kafka.
 We have lots of tools doing a lot of different things and they are all kind
 of in different places.

 So, what I was thinking is to have a single interface for our tooling
 https://issues.apache.org/jira/browse/KAFKA-1694

 This would manifest itself in two ways 1) a command line interface 2) a repl

 We would have one entry point centrally for all Kafka commands.
 kafka CMD ARGS
 kafka createTopic --brokerList etc,
 kafka reassignPartition --brokerList etc,

 or execute and run the shell

 kafka --brokerList localhost
 kafka>use topicName;
 kafka>set acl='label';

 I was thinking that all calls would be initialized through --brokerList and
 the broker can tell the KafkaCommandTool what server to connect to for
 MetaData.

 Thoughts? Tomatoes?

 /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop 
 http://www.twitter.com/allthingshadoop
 /

   
  
 



 --
 -- Guozhang