Re: topic's partition have no leader and isr

2014-07-09 Thread 鞠大升
@Jun Rao,   Kafka version: 0.8.1.1

@Guozhang Wang, I cannot find the original controller log, but I can give
the controller log produced after running ./bin/kafka-reassign-partitions.sh
and ./bin/kafka-preferred-replica-election.sh.

Now I do not know how to recover the leader for partitions 25 and 31. Any ideas?
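For reference, the leader and ISR that the controller has persisted for these two
partitions can be read directly from their state znodes in ZooKeeper. A rough Scala
sketch (the connect string "zk1:2181" is a placeholder; this is only an illustration,
not a supported tool):

{code}
import java.util.concurrent.CountDownLatch
import org.apache.zookeeper.{WatchedEvent, Watcher, ZooKeeper}

object ShowPartitionState {
  def main(args: Array[String]): Unit = {
    val connected = new CountDownLatch(1)
    val zk = new ZooKeeper("zk1:2181", 30000, new Watcher {
      def process(event: WatchedEvent): Unit = connected.countDown()
    })
    connected.await()                       // wait for the session to be established
    try {
      for (partition <- Seq(25, 31)) {
        val path = s"/brokers/topics/org.mobile_nginx/partitions/$partition/state"
        val data = new String(zk.getData(path, false, null), "UTF-8")
        println(s"$path -> $data")          // JSON with the persisted leader, leader_epoch, isr
      }
    } finally zk.close()
  }
}
{code}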

- controller log for ./bin/kafka-reassign-partitions.sh
---
[2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]:
Partitions reassigned listener fired for path /admin/reassign_partitions.
Record partitions to be reassigned
{version:1,partitions:[{topic:org.mobile_nginx,partition:25,replicas:[6]},{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
(kafka.controller.PartitionsReassignedListener)
[2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add Partition
triggered
{version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
for path /brokers/topics/org.mobile_nginx
(kafka.controller.PartitionStateMachine$AddPartitionsListener)
[2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]:
Partitions reassigned listener fired for path /admin/reassign_partitions.
Record partitions to be reassigned
{version:1,partitions:[{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
(kafka.controller.PartitionsReassignedListener)
[2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add Partition
triggered
{version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
for path /brokers/topics/org.mobile_nginx
(kafka.controller.PartitionStateMachine$AddPartitionsListener)

- controller log for ./bin/kafka-preferred-replica-election.sh
---
[2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on 5]:
Preferred replica election listener fired for path
/admin/preferred_replica_election. Record partitions to undergo preferred
replica election
{version:1,partitions:[{topic:org.mobile_nginx,partition:25},{topic:org.mobile_nginx,partition:31}]}
(kafka.controller.PreferredReplicaElectionListener)
[2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred replica
leader election for partitions [org.mobile_nginx,25],[org.mobile_nginx,31]
(kafka.controller.KafkaController)
[2014-07-09 15:07:02,969] INFO [Partition state machine on Controller 5]:
Invoking state change to OnlinePartition for partitions
[org.mobile_nginx,25],[org.mobile_nginx,31]
(kafka.controller.PartitionStateMachine)
[2014-07-09 15:07:02,972] INFO [PreferredReplicaPartitionLeaderSelector]:
Current leader -1 for partition [org.mobile_nginx,25] is not the preferred
replica. Trigerring preferred replica leader election
(kafka.controller.PreferredReplicaPartitionLeaderSelector)
[2014-07-09 15:07:02,973] INFO [PreferredReplicaPartitionLeaderSelector]:
Current leader -1 for partition [org.mobile_nginx,31] is not the preferred
replica. Trigerring preferred replica leader election
(kafka.controller.PreferredReplicaPartitionLeaderSelector)
[2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
[org.mobile_nginx,25] failed to complete preferred replica leader election.
Leader is -1 (kafka.controller.KafkaController)
[2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
[org.mobile_nginx,31] failed to complete preferred replica leader election.
Leader is -1 (kafka.controller.KafkaController)


On Sun, Jul 6, 2014 at 11:47 PM, Jun Rao jun...@gmail.com wrote:

 Also, which version of Kafka are you using?

 Thanks,

 Jun


 On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 dashen...@gmail.com wrote:

  hi, all
 
  I have a topic with 32 partitions; after some reassign operations, 2
  partitions ended up with no leader and no ISR.
 
 
 ---
   Topic:org.mobile_nginx  PartitionCount:32   ReplicationFactor:1 Configs:
   Topic: org.mobile_nginx Partition: 0    Leader: 3   Replicas: 3 Isr: 3
   Topic: org.mobile_nginx Partition: 1    Leader: 4   Replicas: 4 Isr: 4
   Topic: org.mobile_nginx Partition: 2    Leader: 5   Replicas: 5 Isr: 5
   Topic: org.mobile_nginx Partition: 3    Leader: 6   Replicas: 6 Isr: 6
   Topic: org.mobile_nginx Partition: 4    Leader: 3   Replicas: 3 Isr: 3
   Topic: org.mobile_nginx Partition: 5    Leader: 4   Replicas: 4 Isr: 4
   Topic: org.mobile_nginx Partition: 6    Leader: 5   Replicas: 5 Isr: 5
   Topic: org.mobile_nginx Partition: 7    Leader: 6   Replicas: 6 Isr: 6
   Topic: org.mobile_nginx Partition: 8    Leader: 3   Replicas: 3

Re: topic's partition have no leader and isr

2014-07-09 Thread 鞠大升
I think @chenlax has encountered the same problem as me on
us...@kafka.apache.org, in the thread titled "How recover leader when broker restart".
CC'ing us...@kafka.apache.org.


On Wed, Jul 9, 2014 at 3:10 PM, 鞠大升 dashen...@gmail.com wrote:

 @Jun Rao,   Kafka version: 0.8.1.1

 @Guozhang Wang, I cannot find the original controller log, but I can
 give the controller log produced after running ./bin/kafka-reassign-partitions.sh
 and ./bin/kafka-preferred-replica-election.sh.

 Now I do not know how to recover the leader for partitions 25 and 31. Any ideas?

 - controller log for ./bin/kafka-reassign-partitions.sh
 ---
 [2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]:
 Partitions reassigned listener fired for path /admin/reassign_partitions.
 Record partitions to be reassigned
 {version:1,partitions:[{topic:org.mobile_nginx,partition:25,replicas:[6]},{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
 (kafka.controller.PartitionsReassignedListener)
 [2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add Partition
 triggered
 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
 for path /brokers/topics/org.mobile_nginx
 (kafka.controller.PartitionStateMachine$AddPartitionsListener)
 [2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]:
 Partitions reassigned listener fired for path /admin/reassign_partitions.
 Record partitions to be reassigned
 {version:1,partitions:[{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
 (kafka.controller.PartitionsReassignedListener)
 [2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add Partition
 triggered
 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
 for path /brokers/topics/org.mobile_nginx
 (kafka.controller.PartitionStateMachine$AddPartitionsListener)

 - controller log for ./bin/kafka-preferred-replica-election.sh
 ---
 [2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on 5]:
 Preferred replica election listener fired for path
 /admin/preferred_replica_election. Record partitions to undergo preferred
 replica election
 {version:1,partitions:[{topic:org.mobile_nginx,partition:25},{topic:org.mobile_nginx,partition:31}]}
 (kafka.controller.PreferredReplicaElectionListener)
 [2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred replica
 leader election for partitions [org.mobile_nginx,25],[org.mobile_nginx,31]
 (kafka.controller.KafkaController)
 [2014-07-09 15:07:02,969] INFO [Partition state machine on Controller 5]:
 Invoking state change to OnlinePartition for partitions
 [org.mobile_nginx,25],[org.mobile_nginx,31]
 (kafka.controller.PartitionStateMachine)
 [2014-07-09 15:07:02,972] INFO [PreferredReplicaPartitionLeaderSelector]:
 Current leader -1 for partition [org.mobile_nginx,25] is not the preferred
 replica. Trigerring preferred replica leader election
 (kafka.controller.PreferredReplicaPartitionLeaderSelector)
 [2014-07-09 15:07:02,973] INFO [PreferredReplicaPartitionLeaderSelector]:
 Current leader -1 for partition [org.mobile_nginx,31] is not the preferred
 replica. Trigerring preferred replica leader election
 (kafka.controller.PreferredReplicaPartitionLeaderSelector)
 [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
 [org.mobile_nginx,25] failed to complete preferred replica leader election.
 Leader is -1 (kafka.controller.KafkaController)
 [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
 [org.mobile_nginx,31] failed to complete preferred replica leader election.
 Leader is -1 (kafka.controller.KafkaController)


 On Sun, Jul 6, 2014 at 11:47 PM, Jun Rao jun...@gmail.com wrote:

 Also, which version of Kafka are you using?

 Thanks,

 Jun


 On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 dashen...@gmail.com wrote:

  hi, all
 
  I have a topic with 32 partitions; after some reassign operations, 2
  partitions ended up with no leader and no ISR.
 
 
 ---
   Topic:org.mobile_nginx  PartitionCount:32   ReplicationFactor:1 Configs:
   Topic: org.mobile_nginx Partition: 0    Leader: 3   Replicas: 3 Isr: 3
   Topic: org.mobile_nginx Partition: 1    Leader: 4   Replicas: 4 Isr: 4
   Topic: org.mobile_nginx Partition: 2    Leader: 5   Replicas: 5 Isr: 5
   Topic: org.mobile_nginx Partition: 3    Leader: 6   Replicas: 6 Isr: 6
   Topic: org.mobile_nginx Partition: 4    Leader: 3   Replicas: 3 Isr: 3
   Topic: org.mobile_nginx Partition: 5

[jira] [Created] (KAFKA-1531) zookeeper.connection.timeout.ms is set to 10000000 in configuration file in Kafka tarball

2014-07-09 Thread JIRA
Michał Michalski created KAFKA-1531:
---

 Summary: zookeeper.connection.timeout.ms is set to 10000000 in 
configuration file in Kafka tarball
 Key: KAFKA-1531
 URL: https://issues.apache.org/jira/browse/KAFKA-1531
 Project: Kafka
  Issue Type: Bug
  Components: config
Affects Versions: 0.8.1.1, 0.8.0
Reporter: Michał Michalski


I've noticed that the Kafka tarball comes with 
zookeeper.connection.timeout.ms=10000000 in server.properties, while 
https://kafka.apache.org/08/documentation.html says the default is 6000. This 
setting was introduced in the configuration file in 46b6144a, quite a long time 
ago (3 years), so it looks intentional, but as per Jun Rao's comment on IRC, 
6000 sounds more reasonable, so that entry should probably be changed or 
removed from the config altogether.
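As a quick sanity check, the value a given tarball actually ships with can be read 
from its config/server.properties; a rough sketch (the relative path is a placeholder):

{code}
import java.io.FileInputStream
import java.util.Properties

object CheckZkTimeout {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    val in = new FileInputStream("config/server.properties")   // path inside the unpacked tarball
    try props.load(in) finally in.close()
    // Fall back to 6000, the documented default mentioned above
    val timeout = props.getProperty("zookeeper.connection.timeout.ms", "6000")
    println(s"zookeeper.connection.timeout.ms = $timeout")
  }
}
{code}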



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (KAFKA-1531) zookeeper.connection.timeout.ms is set to 10000000 in configuration file in Kafka tarball

2014-07-09 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/KAFKA-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michał Michalski updated KAFKA-1531:


Description: I've noticed that the Kafka tarball comes with 
zookeeper.connection.timeout.ms=10000000 in server.properties, while 
https://kafka.apache.org/08/documentation.html says the default is 6000. This 
setting was introduced in the configuration file in 46b6144a, quite a long time 
ago (3 years), which makes it look intentional, but as per Jun Rao's comment on 
IRC, 6000 sounds more reasonable, so that entry should probably be changed or 
removed from the config altogether.  (was: I've noticed that the Kafka tarball comes with 
zookeeper.connection.timeout.ms=10000000 in server.properties, while 
https://kafka.apache.org/08/documentation.html says the default is 6000. This 
setting was introduced in the configuration file in 46b6144a, quite a long time 
ago (3 years), so it looks intentional, but as per Jun Rao's comment on IRC, 
6000 sounds more reasonable, so that entry should probably be changed or 
removed from the config altogether.)

 zookeeper.connection.timeout.ms is set to 10000000 in configuration file in 
 Kafka tarball
 -

 Key: KAFKA-1531
 URL: https://issues.apache.org/jira/browse/KAFKA-1531
 Project: Kafka
  Issue Type: Bug
  Components: config
Affects Versions: 0.8.0, 0.8.1.1
Reporter: Michał Michalski

 I've noticed that the Kafka tarball comes with 
 zookeeper.connection.timeout.ms=10000000 in server.properties, while 
 https://kafka.apache.org/08/documentation.html says the default is 6000. This 
 setting was introduced in the configuration file in 46b6144a, quite a long 
 time ago (3 years), which makes it look intentional, but as per Jun Rao's 
 comment on IRC, 6000 sounds more reasonable, so that entry should probably be 
 changed or removed from the config altogether.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Stanislav Gilmulin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056049#comment-14056049
 ] 

Stanislav Gilmulin commented on KAFKA-1530:
---

Thank you,
I'd like to ask some questions.
If a cluster has a lot of nodes and a few of them are lagging or down, we can't 
guarantee we would stop and restart nodes properly and in the right order.
Is there any recommended way to manage it?
Or is there an already existing script or tool for it?
Replication factor = 3.
Version 0.8.1.1

 howto update continuously
 -

 Key: KAFKA-1530
 URL: https://issues.apache.org/jira/browse/KAFKA-1530
 Project: Kafka
  Issue Type: Wish
Reporter: Stanislav Gilmulin
Priority: Minor
  Labels: operating_manual, performance

 Hi,
  
 Could I ask you a question about the Kafka update procedure?
 Is there a way to update software, which doesn't require service interruption 
 or lead to data losses?
 We can't stop message brokering during the update as we have a strict SLA.
  
 Best regards
 Stanislav Gilmulin



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (KAFKA-1414) Speedup broker startup after hard reset

2014-07-09 Thread Alexey Ozeritskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Ozeritskiy updated KAFKA-1414:
-

Attachment: parallel-dir-loading-trunk-threadpool.patch

 Speedup broker startup after hard reset
 ---

 Key: KAFKA-1414
 URL: https://issues.apache.org/jira/browse/KAFKA-1414
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.2, 0.9.0, 0.8.1.1
Reporter: Dmitry Bugaychenko
Assignee: Jay Kreps
 Attachments: parallel-dir-loading-0.8.patch, 
 parallel-dir-loading-trunk-threadpool.patch, parallel-dir-loading-trunk.patch


 After a hard reset due to power failure, the broker takes far too much time 
 recovering unflushed segments in a single thread. This could easily be 
 improved by launching multiple threads (one per data directory, assuming that 
 typically each data directory is on a dedicated drive). Locally we tried this 
 simple patch to LogManager.loadLogs and it seems to work; however, I'm too new 
 to Scala, so do not take it literally:
 {code}
   /**
    * Recover and load all logs in the given data directories
    */
   private def loadLogs(dirs: Seq[File]) {
     val threads: Array[Thread] = new Array[Thread](dirs.size)
     var i: Int = 0
     val me = this
     for (dir <- dirs) {
       val thread = new Thread(new Runnable {
         def run() {
           val recoveryPoints = me.recoveryPointCheckpoints(dir).read
           /* load the logs */
           val subDirs = dir.listFiles()
           if (subDirs != null) {
             val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
             if (cleanShutDownFile.exists())
               info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
             for (dir <- subDirs) {
               if (dir.isDirectory) {
                 info("Loading log '" + dir.getName + "'")
                 val topicPartition = Log.parseTopicPartitionName(dir.getName)
                 val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
                 val log = new Log(dir,
                                   config,
                                   recoveryPoints.getOrElse(topicPartition, 0L),
                                   scheduler,
                                   time)
                 val previous = addLogWithLock(topicPartition, log)
                 if (previous != null)
                   throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
               }
             }
             cleanShutDownFile.delete()
           }
         }
       })
       thread.start()
       threads(i) = thread
       i = i + 1
     }
     for (thread <- threads) {
       thread.join()
     }
   }

   def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
     logCreationOrDeletionLock synchronized {
       this.logs.put(topicPartition, log)
     }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1406) Fix scaladoc/javadoc warnings

2014-07-09 Thread Alan Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056147#comment-14056147
 ] 

Alan Lee commented on KAFKA-1406:
-

Would it be okay if I work on this?

 Fix scaladoc/javadoc warnings
 -

 Key: KAFKA-1406
 URL: https://issues.apache.org/jira/browse/KAFKA-1406
 Project: Kafka
  Issue Type: Bug
  Components: packaging
Reporter: Joel Koshy
  Labels: build
 Fix For: 0.8.2


 ./gradlew docsJarAll
 You will see a bunch of warnings mainly due to typos/incorrect use of 
 javadoc/scaladoc



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs

2014-07-09 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy updated KAFKA-1325:
---

Attachment: (was: KAFKA-1325.patch)

 Fix inconsistent per topic log configs
 --

 Key: KAFKA-1325
 URL: https://issues.apache.org/jira/browse/KAFKA-1325
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.1
Reporter: Neha Narkhede
  Labels: usability
 Attachments: KAFKA-1325.patch


 Related thread from the user mailing list - 
 http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
 Our documentation is a little confusing on the log configs. 
 The log property for retention.ms is in millis but the server default it maps 
 to is in minutes.
 Same is true for segment.ms as well. We could either improve the docs or
 change the per-topic configs to be consistent with the server defaults.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: topic's partition have no leader and isr

2014-07-09 Thread Jun Rao
It's weird that you have replication factor 1, but two  of the partitions
25 and 31 have 2 assigned replicas. What's the command you used for
reassignment?

Thanks,

Jun


On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 dashen...@gmail.com wrote:

 @Jun Rao,   Kafka version: 0.8.1.1

  @Guozhang Wang, I cannot find the original controller log, but I can give
  the controller log produced after running ./bin/kafka-reassign-partitions.sh
  and ./bin/kafka-preferred-replica-election.sh.

  Now I do not know how to recover the leader for partitions 25 and 31. Any ideas?

 - controller log for ./bin/kafka-reassign-partitions.sh
 ---
 [2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]:
 Partitions reassigned listener fired for path /admin/reassign_partitions.
 Record partitions to be reassigned

 {version:1,partitions:[{topic:org.mobile_nginx,partition:25,replicas:[6]},{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
 (kafka.controller.PartitionsReassignedListener)
 [2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add Partition
 triggered

 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
 for path /brokers/topics/org.mobile_nginx
 (kafka.controller.PartitionStateMachine$AddPartitionsListener)
 [2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]:
 Partitions reassigned listener fired for path /admin/reassign_partitions.
 Record partitions to be reassigned

 {version:1,partitions:[{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
 (kafka.controller.PartitionsReassignedListener)
 [2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add Partition
 triggered

 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
 for path /brokers/topics/org.mobile_nginx
 (kafka.controller.PartitionStateMachine$AddPartitionsListener)

  - controller log for ./bin/kafka-preferred-replica-election.sh
 ---
 [2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on 5]:
 Preferred replica election listener fired for path
 /admin/preferred_replica_election. Record partitions to undergo preferred
 replica election

 {version:1,partitions:[{topic:org.mobile_nginx,partition:25},{topic:org.mobile_nginx,partition:31}]}
 (kafka.controller.PreferredReplicaElectionListener)
 [2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred replica
 leader election for partitions [org.mobile_nginx,25],[org.mobile_nginx,31]
 (kafka.controller.KafkaController)
 [2014-07-09 15:07:02,969] INFO [Partition state machine on Controller 5]:
 Invoking state change to OnlinePartition for partitions
 [org.mobile_nginx,25],[org.mobile_nginx,31]
 (kafka.controller.PartitionStateMachine)
 [2014-07-09 15:07:02,972] INFO [PreferredReplicaPartitionLeaderSelector]:
 Current leader -1 for partition [org.mobile_nginx,25] is not the preferred
 replica. Trigerring preferred replica leader election
 (kafka.controller.PreferredReplicaPartitionLeaderSelector)
 [2014-07-09 15:07:02,973] INFO [PreferredReplicaPartitionLeaderSelector]:
 Current leader -1 for partition [org.mobile_nginx,31] is not the preferred
 replica. Trigerring preferred replica leader election
 (kafka.controller.PreferredReplicaPartitionLeaderSelector)
 [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
 [org.mobile_nginx,25] failed to complete preferred replica leader election.
 Leader is -1 (kafka.controller.KafkaController)
 [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
 [org.mobile_nginx,31] failed to complete preferred replica leader election.
 Leader is -1 (kafka.controller.KafkaController)


 On Sun, Jul 6, 2014 at 11:47 PM, Jun Rao jun...@gmail.com wrote:

  Also, which version of Kafka are you using?
 
  Thanks,
 
  Jun
 
 
  On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 dashen...@gmail.com wrote:
 
   hi, all
  
   I have a topic with 32 partitions; after some reassign operations, 2
   partitions ended up with no leader and no ISR.
  
  
 
 ---
    Topic:org.mobile_nginx  PartitionCount:32   ReplicationFactor:1 Configs:
    Topic: org.mobile_nginx Partition: 0    Leader: 3   Replicas: 3 Isr: 3
    Topic: org.mobile_nginx Partition: 1    Leader: 4   Replicas: 4 Isr: 4
    Topic: org.mobile_nginx Partition: 2    Leader: 5   Replicas: 5 Isr: 5
    Topic: org.mobile_nginx Partition: 3    Leader: 6   Replicas: 6 Isr: 6
    Topic: org.mobile_nginx Partition: 4    Leader: 3   Replicas: 3 Isr: 3
   

Review Request 23362: Patch for KAFKA-1325

2014-07-09 Thread Manikumar Reddy O

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23362/
---

Review request for kafka.


Bugs: KAFKA-1325
https://issues.apache.org/jira/browse/KAFKA-1325


Repository: kafka


Description
---

KAFKA-1325 Changed per-topic configs segment.ms -> segment.hours, retention.ms 
-> retention.minutes to be consistent with the server defaults


Diffs
-

  core/src/main/scala/kafka/log/LogConfig.scala 
5746ad4767589594f904aa085131dd95e56d72bb 
  core/src/test/scala/unit/kafka/admin/AdminTest.scala 
e28979827110dfbbb92fe5b152e7f1cc973de400 

Diff: https://reviews.apache.org/r/23362/diff/


Testing
---


Thanks,

Manikumar Reddy O



[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs

2014-07-09 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy updated KAFKA-1325:
---

Attachment: KAFKA-1325.patch

 Fix inconsistent per topic log configs
 --

 Key: KAFKA-1325
 URL: https://issues.apache.org/jira/browse/KAFKA-1325
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.1
Reporter: Neha Narkhede
  Labels: usability
 Attachments: KAFKA-1325.patch, KAFKA-1325.patch


 Related thread from the user mailing list - 
 http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
 Our documentation is a little confusing on the log configs. 
 The log property for retention.ms is in millis but the server default it maps 
 to is in minutes.
 Same is true for segment.ms as well. We could either improve the docs or
 change the per-topic configs to be consistent with the server defaults.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1406) Fix scaladoc/javadoc warnings

2014-07-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056299#comment-14056299
 ] 

Jun Rao commented on KAFKA-1406:


Alan,

That would be great. Thanks for your help.

 Fix scaladoc/javadoc warnings
 -

 Key: KAFKA-1406
 URL: https://issues.apache.org/jira/browse/KAFKA-1406
 Project: Kafka
  Issue Type: Bug
  Components: packaging
Reporter: Joel Koshy
  Labels: build
 Fix For: 0.8.2


 ./gradlew docsJarAll
 You will see a bunch of warnings mainly due to typos/incorrect use of 
 javadoc/scaladoc



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs

2014-07-09 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056300#comment-14056300
 ] 

Manikumar Reddy commented on KAFKA-1325:


Created reviewboard https://reviews.apache.org/r/23362/diff/
 against branch origin/trunk

 Fix inconsistent per topic log configs
 --

 Key: KAFKA-1325
 URL: https://issues.apache.org/jira/browse/KAFKA-1325
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.1
Reporter: Neha Narkhede
  Labels: usability
 Attachments: KAFKA-1325.patch, KAFKA-1325.patch


 Related thread from the user mailing list - 
 http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
 Our documentation is a little confusing on the log configs. 
 The log property for retention.ms is in millis but the server default it maps 
 to is in minutes.
 Same is true for segment.ms as well. We could either improve the docs or
 change the per-topic configs to be consistent with the server defaults.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs

2014-07-09 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy updated KAFKA-1325:
---

Attachment: (was: KAFKA-1325.patch)

 Fix inconsistent per topic log configs
 --

 Key: KAFKA-1325
 URL: https://issues.apache.org/jira/browse/KAFKA-1325
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.1
Reporter: Neha Narkhede
  Labels: usability
 Attachments: KAFKA-1325.patch


 Related thread from the user mailing list - 
 http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
 Our documentation is a little confusing on the log configs. 
 The log property for retention.ms is in millis but the server default it maps 
 to is in minutes.
 Same is true for segment.ms as well. We could either improve the docs or
 change the per-topic configs to be consistent with the server defaults.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs

2014-07-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056319#comment-14056319
 ] 

Jun Rao commented on KAFKA-1325:


Thanks for the patch. It seems to me that ms granularity is more general. 
Perhaps we can add log.retention.ms and log.roll.ms in the server config and 
leave the per topic config unchanged. This will also make the change backward 
compatible.

 Fix inconsistent per topic log configs
 --

 Key: KAFKA-1325
 URL: https://issues.apache.org/jira/browse/KAFKA-1325
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.1
Reporter: Neha Narkhede
  Labels: usability
 Attachments: KAFKA-1325.patch


 Related thread from the user mailing list - 
 http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
 Our documentation is a little confusing on the log configs. 
 The log property for retention.ms is in millis but the server default it maps 
 to is in minutes.
 Same is true for segment.ms as well. We could either improve the docs or
 change the per-topic configs to be consistent with the server defaults.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1528) Normalize all the line endings

2014-07-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056323#comment-14056323
 ] 

Jun Rao commented on KAFKA-1528:


Thanks for the patch. It doesn't apply to trunk though. Are there changes other 
than adding the .gitattributes file?

 Normalize all the line endings
 --

 Key: KAFKA-1528
 URL: https://issues.apache.org/jira/browse/KAFKA-1528
 Project: Kafka
  Issue Type: Improvement
Reporter: Evgeny Vereshchagin
Priority: Trivial
 Attachments: KAFKA-1528.patch, KAFKA-1528_2014-07-06_16:20:28.patch, 
 KAFKA-1528_2014-07-06_16:23:21.patch


 Hi!
 I added a .gitattributes file and removed all '\r' from some .bat files.
 See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ and 
 https://help.github.com/articles/dealing-with-line-endings for an explanation.
 Maybe add https://help.github.com/articles/dealing-with-line-endings to the 
 contributing/committing guide?
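To make the '\r' removal concrete, a rough one-off sketch (illustrative only, not the 
attached patch) that strips carriage returns from files passed on the command line:

{code}
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object StripCarriageReturns {
  def main(args: Array[String]): Unit = {
    // Rewrite each named file with '\r' removed, i.e. convert CRLF line endings to LF
    args.foreach { name =>
      val path = Paths.get(name)
      val text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
      Files.write(path, text.replace("\r", "").getBytes(StandardCharsets.UTF_8))
    }
  }
}
{code}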



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset

2014-07-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056338#comment-14056338
 ] 

Jun Rao commented on KAFKA-1414:


Thanks for the patch. A few comments.

1. If there are lots of partitions in a broker, having a thread per partition 
may be too much. Perhaps we can add a new config like log.recovery.threads. It 
can default to 1 to preserve the current behavior.

2. If we hit any exception when loading the logs, the current behavior is that 
we will shut down the broker. We need to preserve this behavior when running 
log loading  in separate threads. So, we need a way to propagate the exceptions 
from those threads to the caller in KafkaServer.

 Speedup broker startup after hard reset
 ---

 Key: KAFKA-1414
 URL: https://issues.apache.org/jira/browse/KAFKA-1414
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.2, 0.9.0, 0.8.1.1
Reporter: Dmitry Bugaychenko
Assignee: Jay Kreps
 Attachments: parallel-dir-loading-0.8.patch, 
 parallel-dir-loading-trunk-threadpool.patch, parallel-dir-loading-trunk.patch


 After a hard reset due to power failure, the broker takes far too much time 
 recovering unflushed segments in a single thread. This could easily be 
 improved by launching multiple threads (one per data directory, assuming that 
 typically each data directory is on a dedicated drive). Locally we tried this 
 simple patch to LogManager.loadLogs and it seems to work; however, I'm too new 
 to Scala, so do not take it literally:
 {code}
   /**
    * Recover and load all logs in the given data directories
    */
   private def loadLogs(dirs: Seq[File]) {
     val threads: Array[Thread] = new Array[Thread](dirs.size)
     var i: Int = 0
     val me = this
     for (dir <- dirs) {
       val thread = new Thread(new Runnable {
         def run() {
           val recoveryPoints = me.recoveryPointCheckpoints(dir).read
           /* load the logs */
           val subDirs = dir.listFiles()
           if (subDirs != null) {
             val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
             if (cleanShutDownFile.exists())
               info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
             for (dir <- subDirs) {
               if (dir.isDirectory) {
                 info("Loading log '" + dir.getName + "'")
                 val topicPartition = Log.parseTopicPartitionName(dir.getName)
                 val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
                 val log = new Log(dir,
                                   config,
                                   recoveryPoints.getOrElse(topicPartition, 0L),
                                   scheduler,
                                   time)
                 val previous = addLogWithLock(topicPartition, log)
                 if (previous != null)
                   throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
               }
             }
             cleanShutDownFile.delete()
           }
         }
       })
       thread.start()
       threads(i) = thread
       i = i + 1
     }
     for (thread <- threads) {
       thread.join()
     }
   }

   def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
     logCreationOrDeletionLock synchronized {
       this.logs.put(topicPartition, log)
     }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (KAFKA-1406) Fix scaladoc/javadoc warnings

2014-07-09 Thread Alan Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Lee updated KAFKA-1406:


Attachment: kafka-1406-v1.patch

I've attached a patch file for the fix.
There are still a few (non variable type-argument ...) warnings during the scaladoc 
build, but they do not seem to be related to documentation.

 Fix scaladoc/javadoc warnings
 -

 Key: KAFKA-1406
 URL: https://issues.apache.org/jira/browse/KAFKA-1406
 Project: Kafka
  Issue Type: Bug
  Components: packaging
Reporter: Joel Koshy
  Labels: build
 Fix For: 0.8.2

 Attachments: kafka-1406-v1.patch


 ./gradlew docsJarAll
 You will see a bunch of warnings mainly due to typos/incorrect use of 
 javadoc/scaladoc



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056339#comment-14056339
 ] 

Guozhang Wang commented on KAFKA-1530:
--

Not sure I understand your question. What do you mean by "we can't guarantee we 
would stop and restart nodes properly and in the right order"?

 howto update continuously
 -

 Key: KAFKA-1530
 URL: https://issues.apache.org/jira/browse/KAFKA-1530
 Project: Kafka
  Issue Type: Wish
Reporter: Stanislav Gilmulin
Priority: Minor
  Labels: operating_manual, performance

 Hi,
  
 Could I ask you a question about the Kafka update procedure?
 Is there a way to update software, which doesn't require service interruption 
 or lead to data losses?
 We can't stop message brokering during the update as we have a strict SLA.
  
 Best regards
 Stanislav Gilmulin



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: topic's partition have no leader and isr

2014-07-09 Thread Guozhang Wang
It seems your broker 1 is in a bad state, besides these two partitions you
also have partition 10 whose Isr/Leader is not part of the replicas list:

Topic: org.mobile_nginx Partition: 10   Leader: 2   Replicas: 1 Isr: 2
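One way to apply the same check to every partition is to scan the full --describe
output; a rough sketch (a hypothetical helper, not an existing Kafka tool). Fed the
listing above, it would flag [org.mobile_nginx,10] as well as leaderless partitions
such as 25 and 31:

{code}
import scala.io.Source

object FindBadAssignments {
  // Matches one partition line of `kafka-topics.sh --describe` output
  private val Line =
    """Topic: (\S+)\s+Partition: (\d+)\s+Leader: (-?\d+)\s+Replicas: ([\d,]+)\s+Isr: ([\d,]*)""".r

  def main(args: Array[String]): Unit = {
    for (line <- Source.stdin.getLines(); m <- Line.findFirstMatchIn(line)) {
      val (topic, partition, leader) = (m.group(1), m.group(2), m.group(3).toInt)
      val replicas = m.group(4).split(",").map(_.trim.toInt).toSet
      val isr = m.group(5).split(",").filter(_.nonEmpty).map(_.trim.toInt).toSet
      // Flag partitions with no leader, or whose leader/ISR is not in the assigned replica list
      if (leader == -1 || !replicas.contains(leader) || !isr.subsetOf(replicas))
        println(s"[$topic,$partition] leader=$leader replicas=$replicas isr=$isr looks inconsistent")
    }
  }
}
{code}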

Maybe you can go to broker 1 and check its logs first.

Guozhang


On Wed, Jul 9, 2014 at 7:32 AM, Jun Rao jun...@gmail.com wrote:

 It's weird that you have replication factor 1, but two  of the partitions
 25 and 31 have 2 assigned replicas. What's the command you used for
 reassignment?

 Thanks,

 Jun


 On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 dashen...@gmail.com wrote:

  @Jun Rao,   Kafka version: 0.8.1.1
 
  @Guozhang Wang, I cannot find the original controller log, but I can give
  the controller log produced after running ./bin/kafka-reassign-partitions.sh
  and ./bin/kafka-preferred-replica-election.sh.

  Now I do not know how to recover the leader for partitions 25 and 31. Any ideas?
 
  - controller log for ./bin/kafka-reassign-partitions.sh
  ---
  [2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]:
  Partitions reassigned listener fired for path /admin/reassign_partitions.
  Record partitions to be reassigned
 
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:25,replicas:[6]},{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
  (kafka.controller.PartitionsReassignedListener)
  [2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add
 Partition
  triggered
 
 
 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
  for path /brokers/topics/org.mobile_nginx
  (kafka.controller.PartitionStateMachine$AddPartitionsListener)
  [2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]:
  Partitions reassigned listener fired for path /admin/reassign_partitions.
  Record partitions to be reassigned
 
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
  (kafka.controller.PartitionsReassignedListener)
  [2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add
 Partition
  triggered
 
 
 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
  for path /brokers/topics/org.mobile_nginx
  (kafka.controller.PartitionStateMachine$AddPartitionsListener)
 
  - controller log for ./bin/kafka-preferred-replica-election.sh
  ---
  [2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on 5]:
  Preferred replica election listener fired for path
  /admin/preferred_replica_election. Record partitions to undergo preferred
  replica election
 
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:25},{topic:org.mobile_nginx,partition:31}]}
  (kafka.controller.PreferredReplicaElectionListener)
  [2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred replica
  leader election for partitions
 [org.mobile_nginx,25],[org.mobile_nginx,31]
  (kafka.controller.KafkaController)
  [2014-07-09 15:07:02,969] INFO [Partition state machine on Controller 5]:
  Invoking state change to OnlinePartition for partitions
  [org.mobile_nginx,25],[org.mobile_nginx,31]
  (kafka.controller.PartitionStateMachine)
  [2014-07-09 15:07:02,972] INFO [PreferredReplicaPartitionLeaderSelector]:
  Current leader -1 for partition [org.mobile_nginx,25] is not the
 preferred
  replica. Trigerring preferred replica leader election
  (kafka.controller.PreferredReplicaPartitionLeaderSelector)
  [2014-07-09 15:07:02,973] INFO [PreferredReplicaPartitionLeaderSelector]:
  Current leader -1 for partition [org.mobile_nginx,31] is not the
 preferred
  replica. Trigerring preferred replica leader election
  (kafka.controller.PreferredReplicaPartitionLeaderSelector)
  [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
  [org.mobile_nginx,25] failed to complete preferred replica leader
 election.
  Leader is -1 (kafka.controller.KafkaController)
  [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
  [org.mobile_nginx,31] failed to complete preferred replica leader
 election.
  Leader is -1 (kafka.controller.KafkaController)
 
 
  On Sun, Jul 6, 2014 at 11:47 PM, Jun Rao jun...@gmail.com wrote:
 
   Also, which version of Kafka are you using?
  
   Thanks,
  
   Jun
  
  
   On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 dashen...@gmail.com wrote:
  
hi, all
   
    I have a topic with 32 partitions; after some reassign operations, 2
    partitions ended up with no leader and no ISR.
   
   
  
 
 ---
Topic:org.mobile_nginx  PartitionCount:32   

[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset

2014-07-09 Thread Alexey Ozeritskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056358#comment-14056358
 ] 

Alexey Ozeritskiy commented on KAFKA-1414:
--

1. Ok.
2. If any thread fails we get an ExecutionException at
{code}
jobs.foreach(_.get())
{code}
We can get the original exception by calling e.getCause() and rethrow it. Is 
this OK?
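For illustration, a minimal sketch of that pattern (directory names are placeholders 
and this is not the actual patch): run one recovery job per data directory on a 
fixed-size pool, and rethrow the first failure's original cause in the caller:

{code}
import java.util.concurrent.{Callable, ExecutionException, Executors}

object ParallelLoadSketch {
  def main(args: Array[String]): Unit = {
    val dirs = Seq("/data/kafka-1", "/data/kafka-2")      // placeholder data directories
    val numThreads = 1                                    // e.g. a log.recovery.threads-style setting
    val pool = Executors.newFixedThreadPool(numThreads)   // fixed pool: tasks start immediately
    try {
      val jobs = dirs.map { dir =>
        pool.submit(new Callable[Unit] {
          def call(): Unit = println(s"recovering logs in $dir")   // real code would load the logs here
        })
      }
      try jobs.foreach(_.get())                                    // failures surface as ExecutionException
      catch { case e: ExecutionException => throw e.getCause }     // rethrow the original exception
    } finally pool.shutdown()
  }
}
{code}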


 Speedup broker startup after hard reset
 ---

 Key: KAFKA-1414
 URL: https://issues.apache.org/jira/browse/KAFKA-1414
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.2, 0.9.0, 0.8.1.1
Reporter: Dmitry Bugaychenko
Assignee: Jay Kreps
 Attachments: parallel-dir-loading-0.8.patch, 
 parallel-dir-loading-trunk-threadpool.patch, parallel-dir-loading-trunk.patch


 After a hard reset due to power failure, the broker takes far too much time 
 recovering unflushed segments in a single thread. This could easily be 
 improved by launching multiple threads (one per data directory, assuming that 
 typically each data directory is on a dedicated drive). Locally we tried this 
 simple patch to LogManager.loadLogs and it seems to work; however, I'm too new 
 to Scala, so do not take it literally:
 {code}
   /**
    * Recover and load all logs in the given data directories
    */
   private def loadLogs(dirs: Seq[File]) {
     val threads: Array[Thread] = new Array[Thread](dirs.size)
     var i: Int = 0
     val me = this
     for (dir <- dirs) {
       val thread = new Thread(new Runnable {
         def run() {
           val recoveryPoints = me.recoveryPointCheckpoints(dir).read
           /* load the logs */
           val subDirs = dir.listFiles()
           if (subDirs != null) {
             val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
             if (cleanShutDownFile.exists())
               info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
             for (dir <- subDirs) {
               if (dir.isDirectory) {
                 info("Loading log '" + dir.getName + "'")
                 val topicPartition = Log.parseTopicPartitionName(dir.getName)
                 val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
                 val log = new Log(dir,
                                   config,
                                   recoveryPoints.getOrElse(topicPartition, 0L),
                                   scheduler,
                                   time)
                 val previous = addLogWithLock(topicPartition, log)
                 if (previous != null)
                   throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
               }
             }
             cleanShutDownFile.delete()
           }
         }
       })
       thread.start()
       threads(i) = thread
       i = i + 1
     }
     for (thread <- threads) {
       thread.join()
     }
   }

   def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
     logCreationOrDeletionLock synchronized {
       this.logs.put(topicPartition, log)
     }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 23363: Patch for KAFKA-1325

2014-07-09 Thread Manikumar Reddy O

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23363/
---

Review request for kafka.


Bugs: KAFKA-1325
https://issues.apache.org/jira/browse/KAFKA-1325


Repository: kafka


Description
---

Added log.retention.ms and log.roll.ms to the server config. If both 
log.retention.ms and log.retention.minutes are given, then log.retention.ms will 
be given priority. If both log.roll.ms and log.roll.hours are given, then the
log.roll.ms property will be given priority.


Diffs
-

  core/src/main/scala/kafka/server/KafkaConfig.scala 
ef75b67b67676ae5b8931902cbc8c0c2cc72c0d3 
  core/src/main/scala/kafka/server/KafkaServer.scala 
c22e51e0412843ec993721ad3230824c0aadd2ba 
  core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
6f4809da9968de293f365307ffd9cfe1d5c34ce0 

Diff: https://reviews.apache.org/r/23363/diff/


Testing
---


Thanks,

Manikumar Reddy O



[jira] [Updated] (KAFKA-1325) Fix inconsistent per topic log configs

2014-07-09 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy updated KAFKA-1325:
---

Attachment: KAFKA-1325.patch

 Fix inconsistent per topic log configs
 --

 Key: KAFKA-1325
 URL: https://issues.apache.org/jira/browse/KAFKA-1325
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.1
Reporter: Neha Narkhede
  Labels: usability
 Attachments: KAFKA-1325.patch, KAFKA-1325.patch


 Related thread from the user mailing list - 
 http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
 Our documentation is a little confusing on the log configs. 
 The log property for retention.ms is in millis but the server default it maps 
 to is in minutes.
 Same is true for segment.ms as well. We could either improve the docs or
 change the per-topic configs to be consistent with the server defaults.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs

2014-07-09 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056402#comment-14056402
 ] 

Manikumar Reddy commented on KAFKA-1325:


Created reviewboard https://reviews.apache.org/r/23363/diff/
 against branch origin/trunk

 Fix inconsistent per topic log configs
 --

 Key: KAFKA-1325
 URL: https://issues.apache.org/jira/browse/KAFKA-1325
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.1
Reporter: Neha Narkhede
  Labels: usability
 Attachments: KAFKA-1325.patch, KAFKA-1325.patch


 Related thread from the user mailing list - 
 http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
 Our documentation is a little confusing on the log configs. 
 The log property for retention.ms is in millis but the server default it maps 
 to is in minutes.
 Same is true for segment.ms as well. We could either improve the docs or
 change the per-topic configs to be consistent with the server defaults.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1325) Fix inconsistent per topic log configs

2014-07-09 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056410#comment-14056410
 ] 

Manikumar Reddy commented on KAFKA-1325:


I added log.retention.ms and log.roll.ms to the server config.
Among log.retention.ms, log.retention.minutes and log.retention.hours, 
log.retention.ms takes precedence.
Between log.roll.ms and log.roll.hours, the log.roll.ms property takes precedence.
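A toy illustration of that precedence rule (not the actual KafkaConfig code):

{code}
import java.util.Properties

object RetentionPrecedence {
  // Resolve the effective retention in ms: ms beats minutes beats hours
  def effectiveRetentionMs(props: Properties): Long =
    Option(props.getProperty("log.retention.ms")).map(_.toLong)
      .orElse(Option(props.getProperty("log.retention.minutes")).map(_.toLong * 60L * 1000L))
      .orElse(Option(props.getProperty("log.retention.hours")).map(_.toLong * 60L * 60L * 1000L))
      .getOrElse(168L * 60L * 60L * 1000L)   // fall back to e.g. a 168-hour default

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.setProperty("log.retention.hours", "168")
    props.setProperty("log.retention.ms", "600000")
    println(effectiveRetentionMs(props))     // prints 600000: the ms setting wins
  }
}
{code}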


 Fix inconsistent per topic log configs
 --

 Key: KAFKA-1325
 URL: https://issues.apache.org/jira/browse/KAFKA-1325
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.1
Reporter: Neha Narkhede
  Labels: usability
 Attachments: KAFKA-1325.patch, KAFKA-1325.patch


 Related thread from the user mailing list - 
 http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3Cd8f6f3857b82c4ccd8725aba6fd68cb8%40cweb01.nmdf.nhnsystem.com%3E
 Our documentation is a little confusing on the log configs. 
 The log property for retention.ms is in millis but the server default it maps 
 to is in minutes.
 Same is true for segment.ms as well. We could either improve the docs or
 change the per-topic configs to be consistent with the server defaults.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Stanislav Gilmulin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056411#comment-14056411
 ] 

Stanislav Gilmulin commented on KAFKA-1530:
---

It means we have the risks.
You're right. My question wasn't clear. Let me try to explain.

First of all, according to business requirements we can't stop the service, so 
we can't stop all nodes before updating.
And, as you've advised, our option would be updating step by step.
But when we update without using the right procedure, we could lose an unknown 
amount of messages, as in the example case presented below.

Let's consider this case as an example.
We have 3 replicas of one partition, with 2 of them lagging behind. Then we 
restart the leader. At that very moment one of the two lagging replicas 
becomes the new leader. After that, when the former leader replica starts 
working again (which in fact has the newest data), it truncates all the 
newest data to match the newly elected leader.
This situation happens quite often when we restart a highly loaded Kafka 
cluster, so we lose some part of our data.



 howto update continuously
 -

 Key: KAFKA-1530
 URL: https://issues.apache.org/jira/browse/KAFKA-1530
 Project: Kafka
  Issue Type: Wish
Reporter: Stanislav Gilmulin
Priority: Minor
  Labels: operating_manual, performance

 Hi,
  
 Could I ask you a question about the Kafka update procedure?
 Is there a way to update software, which doesn't require service interruption 
 or lead to data losses?
 We can't stop message brokering during the update as we have a strict SLA.
  
 Best regards
 Stanislav Gilmulin



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset

2014-07-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056417#comment-14056417
 ] 

Jun Rao commented on KAFKA-1414:


2. Yes, that's probably right. Could you verify that the original exception is 
indeed preserved? Also, should we use newFixedThreadPool instead of 
newScheduledThreadPool since we want to run those tasks immediately?

 Speedup broker startup after hard reset
 ---

 Key: KAFKA-1414
 URL: https://issues.apache.org/jira/browse/KAFKA-1414
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 0.8.2, 0.9.0, 0.8.1.1
Reporter: Dmitry Bugaychenko
Assignee: Jay Kreps
 Attachments: parallel-dir-loading-0.8.patch, 
 parallel-dir-loading-trunk-threadpool.patch, parallel-dir-loading-trunk.patch


 After a hard reset due to power failure, the broker takes far too much time 
 recovering unflushed segments in a single thread. This could easily be 
 improved by launching multiple threads (one per data directory, assuming that 
 typically each data directory is on a dedicated drive). Locally we tried this 
 simple patch to LogManager.loadLogs and it seems to work; however, I'm too new 
 to Scala, so do not take it literally:
 {code}
   /**
    * Recover and load all logs in the given data directories
    */
   private def loadLogs(dirs: Seq[File]) {
     val threads: Array[Thread] = new Array[Thread](dirs.size)
     var i: Int = 0
     val me = this
     for (dir <- dirs) {
       val thread = new Thread(new Runnable {
         def run() {
           val recoveryPoints = me.recoveryPointCheckpoints(dir).read
           /* load the logs */
           val subDirs = dir.listFiles()
           if (subDirs != null) {
             val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
             if (cleanShutDownFile.exists())
               info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
             for (dir <- subDirs) {
               if (dir.isDirectory) {
                 info("Loading log '" + dir.getName + "'")
                 val topicPartition = Log.parseTopicPartitionName(dir.getName)
                 val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
                 val log = new Log(dir,
                                   config,
                                   recoveryPoints.getOrElse(topicPartition, 0L),
                                   scheduler,
                                   time)
                 val previous = addLogWithLock(topicPartition, log)
                 if (previous != null)
                   throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
               }
             }
             cleanShutDownFile.delete()
           }
         }
       })
       thread.start()
       threads(i) = thread
       i = i + 1
     }
     for (thread <- threads) {
       thread.join()
     }
   }

   def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
     logCreationOrDeletionLock synchronized {
       this.logs.put(topicPartition, log)
     }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Oleg (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056422#comment-14056422
 ] 

Oleg commented on KAFKA-1530:
-

Another situation which happened in our production:
We have replication level 3. One out of the 3 replicas started lagging behind (due 
to network connectivity problems, etc.). Then, while upgrading/restarting Kafka, 
we restart the whole cluster. After the upgrade, Kafka starts electing leaders for 
each partition. It is quite likely that it elects the lagging replica as 
the leader, which in turn leads to truncating the two other replicas. In this 
case we lose data.

So we are seeking a means of restarting/upgrading Kafka without data loss.

 howto update continuously
 -

 Key: KAFKA-1530
 URL: https://issues.apache.org/jira/browse/KAFKA-1530
 Project: Kafka
  Issue Type: Wish
Reporter: Stanislav Gilmulin
Priority: Minor
  Labels: operating_manual, performance

 Hi,
  
 Could I ask you a question about the Kafka update procedure?
 Is there a way to update software, which doesn't require service interruption 
 or lead to data losses?
 We can't stop message brokering during the update as we have a strict SLA.
  
 Best regards
 Stanislav Gilmulin



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-1507) Using GetOffsetShell against non-existent topic creates the topic unintentionally

2014-07-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056424#comment-14056424
 ] 

Jun Rao commented on KAFKA-1507:


Thanks for the patch. I am currently working on KAFKA-1462, which I hope will 
standardize how we support multiple versions of a request on the server. 
Perhaps you can wait until KAFKA-1462 is done, hopefully in a week or so.

 Using GetOffsetShell against non-existent topic creates the topic 
 unintentionally
 -

 Key: KAFKA-1507
 URL: https://issues.apache.org/jira/browse/KAFKA-1507
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
 Environment: centos
Reporter: Luke Forehand
Assignee: Sriharsha Chintalapani
Priority: Minor
  Labels: newbie
 Attachments: KAFKA-1507.patch


 A typo in using GetOffsetShell command can cause a
 topic to be created which cannot be deleted (because deletion is still in
 progress)
 ./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
 kafka10:9092,kafka11:9092,kafka12:9092,kafka13:9092 --topic typo --time 1
 ./kafka-topics.sh --zookeeper stormqa1/kafka-prod --describe --topic typo
  Topic:typo  PartitionCount:8    ReplicationFactor:1 Configs:
   Topic: typo Partition: 0    Leader: 10  Replicas: 10    Isr: 10
 ...



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (KAFKA-1530) howto update continuously

2014-07-09 Thread Oleg (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056422#comment-14056422
 ] 

Oleg edited comment on KAFKA-1530 at 7/9/14 4:33 PM:
-

Another situation which happened in our production:
We have replication level 3. One out of the 3 replicas started lagging behind (due 
to network connectivity problems, etc.). Then, while upgrading/restarting Kafka, 
we restart the whole cluster. After the upgrade, the first node to start becomes the 
leader, and it is quite likely that this is the lagging replica, which in 
turn leads to truncating the two other replicas. In this case we lose data.

So we are seeking a means of restarting/upgrading Kafka without data loss.


was (Author: ovgolovin):
Another situation which happened in our production:
We have replication level 3. One out of the 3 replicas started lagging behind (due 
to network connectivity problems, etc.). Then, while upgrading/restarting Kafka, 
we restart the whole cluster. After the upgrade, Kafka starts electing leaders for 
each partition. It is quite likely that it elects the lagging replica as 
the leader, which in turn leads to truncating the two other replicas. In this 
case we lose data.

So we are seeking a means of restarting/upgrading Kafka without data loss.

 howto update continuously
 -

 Key: KAFKA-1530
 URL: https://issues.apache.org/jira/browse/KAFKA-1530
 Project: Kafka
  Issue Type: Wish
Reporter: Stanislav Gilmulin
Priority: Minor
  Labels: operating_manual, performance

 Hi,
  
 Could I ask you a question about the Kafka update procedure?
 Is there a way to update the software that doesn't require service interruption
 or lead to data loss?
 We can't stop message brokering during the update as we have a strict SLA.
  
 Best regards
 Stanislav Gilmulin





[jira] [Resolved] (KAFKA-1515) Wake-up Sender upon blocked on fetching leader metadata

2014-07-09 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps resolved KAFKA-1515.
--

Resolution: Fixed

 Wake-up Sender upon blocked on fetching leader metadata
 ---

 Key: KAFKA-1515
 URL: https://issues.apache.org/jira/browse/KAFKA-1515
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.9.0

 Attachments: KAFKA-1515_2014-07-03_10:19:28.patch, 
 KAFKA-1515_2014-07-03_16:43:05.patch, KAFKA-1515_2014-07-07_10:55:58.patch, 
 KAFKA-1515_2014-07-08_11:35:59.patch


 Currently the new KafkaProducer will not wake up the sender thread upon
 forcing a metadata fetch, and hence if the sender is polling with a long
 timeout (e.g. the metadata.age period) this wait will usually time out and
 fail.





[jira] [Commented] (KAFKA-1019) kafka-preferred-replica-election.sh will fail without clear error message if /brokers/topics/[topic]/partitions does not exist

2014-07-09 Thread Mickael Hemri (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056440#comment-14056440
 ] 

Mickael Hemri commented on KAFKA-1019:
--

Hi Jun,

We use ZK 3.4.6; could that be causing this issue?
How can we check that the topic watcher is registered by the controller?

Thanks

 kafka-preferred-replica-election.sh will fail without clear error message if 
 /brokers/topics/[topic]/partitions does not exist
 --

 Key: KAFKA-1019
 URL: https://issues.apache.org/jira/browse/KAFKA-1019
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Guozhang Wang
  Labels: newbie
 Fix For: 0.8.2


 From Libo Yu:
 I tried to run kafka-preferred-replica-election.sh on our kafka cluster.
 But I got this exception:
 Failed to start preferred replica election
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /brokers/topics/uattoqaaa.default/partitions
 I checked zookeeper and there is no 
 /brokers/topics/uattoqaaa.default/partitions. All I found is
 /brokers/topics/uattoqaaa.default.





[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload

2014-07-09 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056441#comment-14056441
 ] 

Jay Kreps commented on KAFKA-1532:
--

This is a good idea, but unfortunately would break compatibility with existing 
clients. It would be good to save up enough of these fixes and do them all at 
once to minimize breakage.

 Move CRC32 to AFTER the payload
 ---

 Key: KAFKA-1532
 URL: https://issues.apache.org/jira/browse/KAFKA-1532
 Project: Kafka
  Issue Type: Improvement
  Components: core, producer 
Reporter: Julian Morrison
Assignee: Jun Rao
Priority: Minor

 To support streaming a message of known length but unknown content, take the 
 CRC32 out of the message header and make it a message trailer. Then client 
 libraries can calculate it after streaming the message to Kafka, without 
 materializing the whole message in RAM.





[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056448#comment-14056448
 ] 

Guozhang Wang commented on KAFKA-1530:
--

Hi Stanislav/Oleg,

The Kafka server has a config called controlled.shutdown.enable; when it is
turned on, the shutdown process will first wait for all the leaders on the node
being shut down to migrate to other nodes before stopping the server
(http://kafka.apache.org/documentation.html#brokerconfigs).

For your first case, where the node being shut down holds the only replica in
the ISR, the shutdown process will block until other nodes are back in the ISR
and can take over the partitions; for your second case, where more than one
node is in the ISR, the leaders on the node being shut down are guaranteed to
be moved to another ISR node.
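
For illustration, a minimal rolling-restart sketch, assuming
controlled.shutdown.enable=true is already set in config/server.properties on
every broker, that kafka-topics.sh in your version supports
--under-replicated-partitions, and that the host names and install path below
are hypothetical:

# restart brokers one at a time, letting controlled shutdown move leaders first
for broker in kafka1 kafka2 kafka3; do
  ssh "$broker" 'cd /opt/kafka && ./bin/kafka-server-stop.sh'
  ssh "$broker" 'cd /opt/kafka && nohup ./bin/kafka-server-start.sh config/server.properties > kafka.out 2>&1 &'
  # wait until nothing is under-replicated before touching the next broker
  while ./bin/kafka-topics.sh --zookeeper zk1:2181 --describe \
        --under-replicated-partitions | grep -q .; do
    sleep 5
  done
done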

 howto update continuously
 -

 Key: KAFKA-1530
 URL: https://issues.apache.org/jira/browse/KAFKA-1530
 Project: Kafka
  Issue Type: Wish
Reporter: Stanislav Gilmulin
Priority: Minor
  Labels: operating_manual, performance

 Hi,
  
 Could I ask you a question about the Kafka update procedure?
 Is there a way to update the software that doesn't require service interruption
 or lead to data loss?
 We can't stop message brokering during the update as we have a strict SLA.
  
 Best regards
 Stanislav Gilmulin





[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload

2014-07-09 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056444#comment-14056444
 ] 

Jay Kreps commented on KAFKA-1532:
--

Also since Kafka itself fully materializes messages in memory, fixing this 
won't actually let clients send arbitrarily large messages, as the broker will 
still choke.

 Move CRC32 to AFTER the payload
 ---

 Key: KAFKA-1532
 URL: https://issues.apache.org/jira/browse/KAFKA-1532
 Project: Kafka
  Issue Type: Improvement
  Components: core, producer 
Reporter: Julian Morrison
Assignee: Jun Rao
Priority: Minor

 To support streaming a message of known length but unknown content, take the 
 CRC32 out of the message header and make it a message trailer. Then client 
 libraries can calculate it after streaming the message to Kafka, without 
 materializing the whole message in RAM.





[jira] [Commented] (KAFKA-1019) kafka-preferred-replica-election.sh will fail without clear error message if /brokers/topics/[topic]/partitions does not exist

2014-07-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056456#comment-14056456
 ] 

Jun Rao commented on KAFKA-1019:


I am not sure how reliable ZK 3.4.6 is. If you check the ZK admin page, it
shows you the command for listing all registered watchers. You want to check
whether there is a child watcher on /brokers/topics.

Another thing is to avoid ZK session expiration, since it may expose some
corner-case bugs.
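
For example, a quick check from the shell, assuming the ZooKeeper four-letter
commands are reachable on the client port (host name hypothetical):

# wchp lists watches grouped by znode path; the controller should have a watch
# registered on /brokers/topics (the watching sessions appear on the lines below it)
echo wchp | nc zk1 2181 | grep -A 2 '^/brokers/topics$'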

 kafka-preferred-replica-election.sh will fail without clear error message if 
 /brokers/topics/[topic]/partitions does not exist
 --

 Key: KAFKA-1019
 URL: https://issues.apache.org/jira/browse/KAFKA-1019
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Guozhang Wang
  Labels: newbie
 Fix For: 0.8.2


 From Libo Yu:
 I tried to run kafka-preferred-replica-election.sh on our kafka cluster.
 But I got this exception:
 Failed to start preferred replica election
 org.I0Itec.zkclient.exception.ZkNoNodeException: 
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
 NoNode for /brokers/topics/uattoqaaa.default/partitions
 I checked zookeeper and there is no 
 /brokers/topics/uattoqaaa.default/partitions. All I found is
 /brokers/topics/uattoqaaa.default.





[jira] [Commented] (KAFKA-1528) Normalize all the line endings

2014-07-09 Thread Evgeny Vereshchagin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056521#comment-14056521
 ] 

Evgeny Vereshchagin commented on KAFKA-1528:


[~junrao], what command do you run to apply the patch?

git apply --stat xyz-v1.patch ?

I want to reproduce it. 
It looks like I sent an invalid patch.

 Normalize all the line endings
 --

 Key: KAFKA-1528
 URL: https://issues.apache.org/jira/browse/KAFKA-1528
 Project: Kafka
  Issue Type: Improvement
Reporter: Evgeny Vereshchagin
Priority: Trivial
 Attachments: KAFKA-1528.patch, KAFKA-1528_2014-07-06_16:20:28.patch, 
 KAFKA-1528_2014-07-06_16:23:21.patch


 Hi!
 I added a .gitattributes file and removed all '\r' characters from some .bat files.
 See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ and
 https://help.github.com/articles/dealing-with-line-endings for an explanation.
 Maybe add https://help.github.com/articles/dealing-with-line-endings to the
 contributing/committing guide?
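
A minimal sketch of the kind of .gitattributes the description refers to (the
attached patch is authoritative; the exact rules there may differ):

# create the attributes file at the repository root; git then normalizes text
# files to LF in the repository while .bat files keep CRLF on checkout
cat > .gitattributes <<'EOF'
* text=auto
*.bat text eol=crlf
EOF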





Re: topic's partition have no leader and isr

2014-07-09 Thread DashengJu
That's because the topic had replication factor 2 at first; I then wanted to
reassign all partitions from 2 replicas to 1 replica, and all partitions were
changed successfully except 25 and 31.

I think partitions 25 and 31 already had no leader before the reassign
operation, which caused the reassignment to fail.
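
For reference, a reassignment of this kind would normally be driven by a JSON
file passed to kafka-reassign-partitions.sh; a hedged sketch follows (the
actual file used was not posted, and the ZooKeeper host here is hypothetical):

# shrink partitions 25 and 31 of org.mobile_nginx to a single replica each
cat > shrink-replicas.json <<'EOF'
{"version":1,"partitions":[
  {"topic":"org.mobile_nginx","partition":25,"replicas":[6]},
  {"topic":"org.mobile_nginx","partition":31,"replicas":[3]}
]}
EOF
./bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file shrink-replicas.json --execute
# --verify (with the same JSON file) reports whether the reassignment completed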
On Jul 9, 2014, at 10:33 PM, Jun Rao jun...@gmail.com wrote:

 It's weird that you have replication factor 1, but two  of the partitions
 25 and 31 have 2 assigned replicas. What's the command you used for
 reassignment?

 Thanks,

 Jun


 On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 dashen...@gmail.com wrote:

  @Jun Rao,   Kafka version: 0.8.1.1
 
  @Guozhang Wang, I can not found the original controller log, but I can
 give
  the controller log after execute ./bin/kafka-reassign-partitions.sh
  and ./bin/kafka-preferred-replica-election.sh
 
  Now I do not known how to recover leader for partition 25 and 31, any
 idea?
 
  - controller log for ./bin/kafka-reassign-partitions.sh
  ---
  [2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]:
  Partitions reassigned listener fired for path /admin/reassign_partitions.
  Record partitions to be reassigned
 
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:25,replicas:[6]},{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
  (kafka.controller.PartitionsReassignedListener)
  [2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add
 Partition
  triggered
 
 
 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
  for path /brokers/topics/org.mobile_nginx
  (kafka.controller.PartitionStateMachine$AddPartitionsListener)
  [2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]:
  Partitions reassigned listener fired for path /admin/reassign_partitions.
  Record partitions to be reassigned
 
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
  (kafka.controller.PartitionsReassignedListener)
  [2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add
 Partition
  triggered
 
 
 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
  for path /brokers/topics/org.mobile_nginx
  (kafka.controller.PartitionStateMachine$AddPartitionsListener)
 
  - controller log for ./bin/kafka-reassign-partitions.sh
  ---
  [2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on 5]:
  Preferred replica election listener fired for path
  /admin/preferred_replica_election. Record partitions to undergo preferred
  replica election
 
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:25},{topic:org.mobile_nginx,partition:31}]}
  (kafka.controller.PreferredReplicaElectionListener)
  [2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred replica
  leader election for partitions
 [org.mobile_nginx,25],[org.mobile_nginx,31]
  (kafka.controller.KafkaController)
  [2014-07-09 15:07:02,969] INFO [Partition state machine on Controller 5]:
  Invoking state change to OnlinePartition for partitions
  [org.mobile_nginx,25],[org.mobile_nginx,31]
  (kafka.controller.PartitionStateMachine)
  [2014-07-09 15:07:02,972] INFO [PreferredReplicaPartitionLeaderSelector]:
  Current leader -1 for partition [org.mobile_nginx,25] is not the
 preferred
  replica. Trigerring preferred replica leader election
  (kafka.controller.PreferredReplicaPartitionLeaderSelector)
  [2014-07-09 15:07:02,973] INFO [PreferredReplicaPartitionLeaderSelector]:
  Current leader -1 for partition [org.mobile_nginx,31] is not the
 preferred
  replica. Trigerring preferred replica leader election
  (kafka.controller.PreferredReplicaPartitionLeaderSelector)
  [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
  [org.mobile_nginx,25] failed to complete preferred replica leader
 election.
  Leader is -1 (kafka.controller.KafkaController)
  [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
  [org.mobile_nginx,31] failed to complete preferred replica leader
 election.
  Leader is -1 (kafka.controller.KafkaController)
 
 
  On Sun, Jul 6, 2014 at 11:47 PM, Jun Rao jun...@gmail.com wrote:
 
   Also, which version of Kafka are you using?
  
   Thanks,
  
   Jun
  
  
   On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 dashen...@gmail.com wrote:
  
hi, all
   
I have a topic with 32 partitions, after some reassign operation, 2
partitions became to no leader and isr.
   
   
  
 
 ---
Topic:org.mobile_nginx  PartitionCount:32   ReplicationFactor:1
   

Re: topic's partition have no leader and isr

2014-07-09 Thread DashengJu
Thanks for spotting that; it is weird.

I checked my broker 1, and it is working fine, with no error or warn logs.

I checked the data folder and found that partition 10's replica, ISR and leader
are actually on broker 2 and working fine. Just now I executed a reassignment
operation to move partition 10's replicas to broker 2, and the controller log
says "Partition [org.mobile_nginx,10] to be reassigned is already assigned to
replicas 2. Ignoring request for partition reassignment." Describing the topic
now shows that partition 10's replicas became 2.
On Jul 9, 2014, at 11:31 PM, Guozhang Wang wangg...@gmail.com wrote:

 It seems your broker 1 is in a bad state, besides these two partitions you
 also have partition 10 whose Isr/Leader is not part of the replicas list:

 Topic: org.mobile_nginx Partition: 10   Leader: 2   Replicas: 1
 Isr: 2

 Maybe you can go to broker 1 and check its logs first.

 Guozhang


 On Wed, Jul 9, 2014 at 7:32 AM, Jun Rao jun...@gmail.com wrote:

  It's weird that you have replication factor 1, but two  of the partitions
  25 and 31 have 2 assigned replicas. What's the command you used for
  reassignment?
 
  Thanks,
 
  Jun
 
 
  On Wed, Jul 9, 2014 at 12:10 AM, 鞠大升 dashen...@gmail.com wrote:
 
   @Jun Rao,   Kafka version: 0.8.1.1
  
   @Guozhang Wang, I can not found the original controller log, but I can
  give
   the controller log after execute ./bin/kafka-reassign-partitions.sh
   and ./bin/kafka-preferred-replica-election.sh
  
   Now I do not known how to recover leader for partition 25 and 31, any
  idea?
  
   - controller log for ./bin/kafka-reassign-partitions.sh
   ---
   [2014-07-09 15:01:31,552] DEBUG [PartitionsReassignedListener on 5]:
   Partitions reassigned listener fired for path
 /admin/reassign_partitions.
   Record partitions to be reassigned
  
  
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:25,replicas:[6]},{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
   (kafka.controller.PartitionsReassignedListener)
   [2014-07-09 15:01:31,579] INFO [AddPartitionsListener on 5]: Add
  Partition
   triggered
  
  
 
 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
   for path /brokers/topics/org.mobile_nginx
   (kafka.controller.PartitionStateMachine$AddPartitionsListener)
   [2014-07-09 15:01:31,587] DEBUG [PartitionsReassignedListener on 5]:
   Partitions reassigned listener fired for path
 /admin/reassign_partitions.
   Record partitions to be reassigned
  
  
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:31,replicas:[3]}]}
   (kafka.controller.PartitionsReassignedListener)
   [2014-07-09 15:01:31,590] INFO [AddPartitionsListener on 5]: Add
  Partition
   triggered
  
  
 
 {version:1,partitions:{12:[1],8:[3],19:[5],23:[5],4:[3],15:[2],11:[2],9:[4],22:[4],26:[2],13:[2],24:[6],16:[4],5:[4],10:[1],21:[3],6:[5],1:[4],17:[5],25:[6,1],14:[4],31:[3,1],0:[3],20:[2],27:[3],2:[5],18:[6],30:[6],7:[6],29:[5],3:[6],28:[4]}}
   for path /brokers/topics/org.mobile_nginx
   (kafka.controller.PartitionStateMachine$AddPartitionsListener)
  
   - controller log for ./bin/kafka-reassign-partitions.sh
   ---
   [2014-07-09 15:07:02,968] DEBUG [PreferredReplicaElectionListener on
 5]:
   Preferred replica election listener fired for path
   /admin/preferred_replica_election. Record partitions to undergo
 preferred
   replica election
  
  
 
 {version:1,partitions:[{topic:org.mobile_nginx,partition:25},{topic:org.mobile_nginx,partition:31}]}
   (kafka.controller.PreferredReplicaElectionListener)
   [2014-07-09 15:07:02,969] INFO [Controller 5]: Starting preferred
 replica
   leader election for partitions
  [org.mobile_nginx,25],[org.mobile_nginx,31]
   (kafka.controller.KafkaController)
   [2014-07-09 15:07:02,969] INFO [Partition state machine on Controller
 5]:
   Invoking state change to OnlinePartition for partitions
   [org.mobile_nginx,25],[org.mobile_nginx,31]
   (kafka.controller.PartitionStateMachine)
   [2014-07-09 15:07:02,972] INFO
 [PreferredReplicaPartitionLeaderSelector]:
   Current leader -1 for partition [org.mobile_nginx,25] is not the
  preferred
   replica. Trigerring preferred replica leader election
   (kafka.controller.PreferredReplicaPartitionLeaderSelector)
   [2014-07-09 15:07:02,973] INFO
 [PreferredReplicaPartitionLeaderSelector]:
   Current leader -1 for partition [org.mobile_nginx,31] is not the
  preferred
   replica. Trigerring preferred replica leader election
   (kafka.controller.PreferredReplicaPartitionLeaderSelector)
   [2014-07-09 15:07:02,973] WARN [Controller 5]: Partition
   [org.mobile_nginx,25] failed to complete preferred replica leader
  election.
   Leader is -1 (kafka.controller.KafkaController)
   [2014-07-09 15:07:02,973] WARN [Controller 

[jira] [Commented] (KAFKA-1528) Normalize all the line endings

2014-07-09 Thread Evgeny Vereshchagin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056720#comment-14056720
 ] 

Evgeny Vereshchagin commented on KAFKA-1528:


kafka-patch-review.py squashes the two commits into one giant diff, which is
wrong for these changes. Maybe format-patch and apply-patch would be better?
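
A sketch of that flow with plain git, assuming the work sits on a local branch
off trunk (branch and file names hypothetical):

# one mail-formatted patch per commit, instead of a single squashed diff
git format-patch trunk..KAFKA-1528-line-endings --stdout > KAFKA-1528-series.patch
# reviewer side: apply the series with the original commit metadata intact
git am KAFKA-1528-series.patch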

 Normalize all the line endings
 --

 Key: KAFKA-1528
 URL: https://issues.apache.org/jira/browse/KAFKA-1528
 Project: Kafka
  Issue Type: Improvement
Reporter: Evgeny Vereshchagin
Priority: Trivial
 Attachments: KAFKA-1528.patch, KAFKA-1528_2014-07-06_16:20:28.patch, 
 KAFKA-1528_2014-07-06_16:23:21.patch


 Hi!
 I added a .gitattributes file and removed all '\r' characters from some .bat files.
 See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ and
 https://help.github.com/articles/dealing-with-line-endings for an explanation.
 Maybe add https://help.github.com/articles/dealing-with-line-endings to the
 contributing/committing guide?





[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload

2014-07-09 Thread Julian Morrison (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056792#comment-14056792
 ] 

Julian Morrison commented on KAFKA-1532:


Is "Kafka itself fully materializes messages in memory" unavoidable? (I am
assuming writing it to file buffers is obvious.)

 Move CRC32 to AFTER the payload
 ---

 Key: KAFKA-1532
 URL: https://issues.apache.org/jira/browse/KAFKA-1532
 Project: Kafka
  Issue Type: Improvement
  Components: core, producer 
Reporter: Julian Morrison
Assignee: Jun Rao
Priority: Minor

 To support streaming a message of known length but unknown content, take the 
 CRC32 out of the message header and make it a message trailer. Then client 
 libraries can calculate it after streaming the message to Kafka, without 
 materializing the whole message in RAM.





[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload

2014-07-09 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056802#comment-14056802
 ] 

Jay Kreps commented on KAFKA-1532:
--

It's not unavoidable, but it is pretty hard. To make this really work we have 
to be able to stream partial requests from the network layer, to the 
api/processing layer, and to the log layer. This is even more complicated 
because additional appends to the log have to be blocked until the message is 
completely written.

 Move CRC32 to AFTER the payload
 ---

 Key: KAFKA-1532
 URL: https://issues.apache.org/jira/browse/KAFKA-1532
 Project: Kafka
  Issue Type: Improvement
  Components: core, producer 
Reporter: Julian Morrison
Assignee: Jun Rao
Priority: Minor

 To support streaming a message of known length but unknown content, take the 
 CRC32 out of the message header and make it a message trailer. Then client 
 libraries can calculate it after streaming the message to Kafka, without 
 materializing the whole message in RAM.





[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload

2014-07-09 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056806#comment-14056806
 ] 

Jay Kreps commented on KAFKA-1532:
--

Your approach where you buffer the message with a backing file and then do the 
full write to the log would also work and might be easier. But still a pretty 
big change.

 Move CRC32 to AFTER the payload
 ---

 Key: KAFKA-1532
 URL: https://issues.apache.org/jira/browse/KAFKA-1532
 Project: Kafka
  Issue Type: Improvement
  Components: core, producer 
Reporter: Julian Morrison
Assignee: Jun Rao
Priority: Minor

 To support streaming a message of known length but unknown content, take the 
 CRC32 out of the message header and make it a message trailer. Then client 
 libraries can calculate it after streaming the message to Kafka, without 
 materializing the whole message in RAM.





[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload

2014-07-09 Thread Julian Morrison (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056816#comment-14056816
 ] 

Julian Morrison commented on KAFKA-1532:


Streaming it to a scratch file (never sync'd, unlinked after use) avoids
blocking the log; the blocking operation becomes "it's verified in the scratch
file, copy it to the log".
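
A hedged sketch of that shape in shell (not Kafka code; it assumes the crc32(1)
helper from Perl's Archive::Zip is installed):

# stream the payload to a scratch file, compute CRC32 once it is complete,
# then append payload plus a 4-byte CRC trailer to the log in one go
tmp=$(mktemp)
cat > "$tmp"                                   # payload streamed from stdin
crc=$(crc32 "$tmp")                            # hex CRC32 of the full payload
{ cat "$tmp"; printf '%08x' "0x$crc" | xxd -r -p; } >> message.log
rm -f "$tmp"                                   # scratch file discarded after use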

 Move CRC32 to AFTER the payload
 ---

 Key: KAFKA-1532
 URL: https://issues.apache.org/jira/browse/KAFKA-1532
 Project: Kafka
  Issue Type: Improvement
  Components: core, producer 
Reporter: Julian Morrison
Assignee: Jun Rao
Priority: Minor

 To support streaming a message of known length but unknown content, take the 
 CRC32 out of the message header and make it a message trailer. Then client 
 libraries can calculate it after streaming the message to Kafka, without 
 materializing the whole message in RAM.





[jira] [Commented] (KAFKA-1532) Move CRC32 to AFTER the payload

2014-07-09 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056833#comment-14056833
 ] 

Jay Kreps commented on KAFKA-1532:
--

Yeah agreed. I think that is doable.

 Move CRC32 to AFTER the payload
 ---

 Key: KAFKA-1532
 URL: https://issues.apache.org/jira/browse/KAFKA-1532
 Project: Kafka
  Issue Type: Improvement
  Components: core, producer 
Reporter: Julian Morrison
Assignee: Jun Rao
Priority: Minor

 To support streaming a message of known length but unknown content, take the 
 CRC32 out of the message header and make it a message trailer. Then client 
 libraries can calculate it after streaming the message to Kafka, without 
 materializing the whole message in RAM.


