[jira] [Commented] (KAFKA-3657) NewProducer NullPointerException on ProduceRequest

2016-05-04 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271007#comment-15271007
 ] 

Vamsi Subhash Achanta commented on KAFKA-3657:
--

I haven't tested it against the 0.9.x branch. This occurs intermittently in 
production and we are still on 0.8.2.1

> NewProducer NullPointerException on ProduceRequest
> --
>
> Key: KAFKA-3657
> URL: https://issues.apache.org/jira/browse/KAFKA-3657
> Project: Kafka
>  Issue Type: Bug
>  Components: network, producer 
>Affects Versions: 0.8.2.1
> Environment: linux 3.2.0 debian7
>Reporter: Vamsi Subhash Achanta
>Assignee: Jun Rao
>
> On send(), the producer appends record batches to the accumulator, and the 
> Sender (a separate thread) flushes them to the server. When the caller 
> invokes get() on the returned future, the produce request waits on the 
> countDownLatch in the FutureRecordMetadata:
> public RecordMetadata get() throws InterruptedException, 
> ExecutionException {
> this.result.await();
> In this case, the client thread is blocked forever (as it is a get() without 
> a timeout) waiting for the response, and the response polled by the Sender 
> returns an attachment with a null batch value. The batch is processed and 
> the request is errored out. The Sender catches the exception at a global 
> level and carries on. As the accumulator has already been drained, the 
> response will never be returned, and the producer client thread calling 
> get() is blocked forever on the latch await call.
> I checked at the server end but still haven't found the reason for the null 
> batch. Any pointers on this?
> ERROR [2016-05-01 21:00:09,256] [kafka-producer-network-thread |producer-app] 
> [Sender] message_id: group_id: : Uncaught error in kafka producer I/O thread:
> ! java.lang.NullPointerException: null
> ! at 
> org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:266)
> ! at 
> org.apache.kafka.clients.producer.internals.Sender.handleResponse(Sender.java:236)
> ! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:196)
> ! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
> ! at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3657) NewProducer NullPointerException on ProduceRequest

2016-05-04 Thread Vamsi Subhash Achanta (JIRA)
Vamsi Subhash Achanta created KAFKA-3657:


 Summary: NewProducer NullPointerException on ProduceRequest
 Key: KAFKA-3657
 URL: https://issues.apache.org/jira/browse/KAFKA-3657
 Project: Kafka
  Issue Type: Bug
  Components: network, producer 
Affects Versions: 0.8.2.1
 Environment: linux 3.2.0 debian7
Reporter: Vamsi Subhash Achanta
Assignee: Jun Rao


On send(), the producer appends record batches to the accumulator, and the 
Sender (a separate thread) flushes them to the server. When the caller invokes 
get() on the returned future, the produce request waits on the countDownLatch 
in the FutureRecordMetadata:
public RecordMetadata get() throws InterruptedException, ExecutionException 
{
this.result.await();

In this case, the client thread is blocked forever (as it is a get() without a 
timeout) waiting for the response, and the response polled by the Sender 
returns an attachment with a null batch value. The batch is processed and the 
request is errored out. The Sender catches the exception at a global level and 
carries on. As the accumulator has already been drained, the response will 
never be returned, and the producer client thread calling get() is blocked 
forever on the latch await call.

I checked at the server end but still haven't found the reason for the null 
batch. Any pointers on this?

ERROR [2016-05-01 21:00:09,256] [kafka-producer-network-thread |producer-app] 
[Sender] message_id: group_id: : Uncaught error in kafka producer I/O thread:
! java.lang.NullPointerException: null
! at 
org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:266)
! at 
org.apache.kafka.clients.producer.internals.Sender.handleResponse(Sender.java:236)
! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:196)
! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
! at java.lang.Thread.run(Thread.java:745)
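
For illustration, a minimal sketch (not the reporter's code) of how a caller 
can bound the wait so a dropped response cannot block it forever: the new 
producer's Future also offers get(timeout, unit), which throws a 
TimeoutException instead of parking indefinitely on the latch. The broker 
address and topic name below are placeholders.

import java.util.Properties;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class BoundedSend {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        try {
            Future<RecordMetadata> future =
                producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            // Bounded wait: if the Sender drops the batch (as in this bug),
            // the caller sees a TimeoutException instead of hanging forever.
            RecordMetadata meta = future.get(30, TimeUnit.SECONDS);
            System.out.println("offset=" + meta.offset());
        } catch (TimeoutException e) {
            System.err.println("produce timed out: " + e); // handle/retry here
        } finally {
            producer.close();
        }
    }
}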



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3657) NewProducer NullPointerException on ProduceRequest

2016-05-04 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-3657:
-
Description: 
On send(), the producer appends record batches to the accumulator, and the 
Sender (a separate thread) flushes them to the server. When the caller invokes 
get() on the returned future, the produce request waits on the countDownLatch 
in the FutureRecordMetadata:
public RecordMetadata get() throws InterruptedException, ExecutionException 
{
this.result.await();

In this case, the client thread is blocked forever (as it is a get() without a 
timeout) waiting for the response, and the response polled by the Sender 
returns an attachment with a null batch value. The batch is processed and the 
request is errored out. The Sender catches the exception at a global level and 
carries on. As the accumulator has already been drained, the response will 
never be returned, and the producer client thread calling get() is blocked 
forever on the latch await call.

I checked at the server end but still haven't found the reason for the null 
batch. Any pointers on this?


ERROR [2016-05-01 21:00:09,256] [kafka-producer-network-thread |producer-app] 
[Sender] message_id: group_id: : Uncaught error in kafka producer I/O thread:
! java.lang.NullPointerException: null
! at 
org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:266)
! at 
org.apache.kafka.clients.producer.internals.Sender.handleResponse(Sender.java:236)
! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:196)
! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
! at java.lang.Thread.run(Thread.java:745)

  was:
On send(), the producer appends record batches to the accumulator, and the 
Sender (a separate thread) flushes them to the server. When the caller invokes 
get() on the returned future, the produce request waits on the countDownLatch 
in the FutureRecordMetadata:
public RecordMetadata get() throws InterruptedException, ExecutionException 
{
this.result.await();

In this case, the client thread is blocked forever (as it is a get() without a 
timeout) waiting for the response, and the response polled by the Sender 
returns an attachment with a null batch value. The batch is processed and the 
request is errored out. The Sender catches the exception at a global level and 
carries on. As the accumulator has already been drained, the response will 
never be returned, and the producer client thread calling get() is blocked 
forever on the latch await call.

I checked at the server end but still haven't found the reason for the null 
batch. Any pointers on this?

ERROR [2016-05-01 21:00:09,256] [kafka-producer-network-thread |producer-app] 
[Sender] message_id: group_id: : Uncaught error in kafka producer I/O thread:
! java.lang.NullPointerException: null
! at 
org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:266)
! at 
org.apache.kafka.clients.producer.internals.Sender.handleResponse(Sender.java:236)
! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:196)
! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
! at java.lang.Thread.run(Thread.java:745)


> NewProducer NullPointerException on ProduceRequest
> --
>
> Key: KAFKA-3657
> URL: https://issues.apache.org/jira/browse/KAFKA-3657
> Project: Kafka
>  Issue Type: Bug
>  Components: network, producer 
>Affects Versions: 0.8.2.1
> Environment: linux 3.2.0 debian7
>Reporter: Vamsi Subhash Achanta
>Assignee: Jun Rao
>
> On send(), the producer appends record batches to the accumulator, and the 
> Sender (a separate thread) flushes them to the server. When the caller 
> invokes get() on the returned future, the produce request waits on the 
> countDownLatch in the FutureRecordMetadata:
> public RecordMetadata get() throws InterruptedException, 
> ExecutionException {
> this.result.await();
> In this case, the client thread is blocked forever (as it is a get() without 
> a timeout) waiting for the response, and the response polled by the Sender 
> returns an attachment with a null batch value. The batch is processed and 
> the request is errored out. The Sender catches the exception at a global 
> level and carries on. As the accumulator has already been drained, the 
> response will never be returned, and the producer client thread calling 
> get() is blocked forever on the latch await call.
> I checked at the server end but still haven't found the reason for the null 
> batch. Any pointers on this?
> ERROR [2016-05-01 21:00:09,256] [kafka-producer-network-thread |producer-app] 
> [Sender] message_id: group_id: : Uncaught error in kafka producer I/O thread:
> ! java.lang.NullPointerException: null
> ! at 
> org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:266)
> ! at 
> org.apache.kafka.clients.producer.internals.Sender.handleResponse(Sender.java:236)
> ! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:196)
> ! at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
> ! at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3359) Parallel log-recovery of un-flushed segments on startup

2016-03-20 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199544#comment-15199544
 ] 

Vamsi Subhash Achanta commented on KAFKA-3359:
--

*bump*
Can this be reviewed?

> Parallel log-recovery of un-flushed segments on startup
> ---
>
> Key: KAFKA-3359
> URL: https://issues.apache.org/jira/browse/KAFKA-3359
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.2.2, 0.9.0.1
>Reporter: Vamsi Subhash Achanta
>Assignee: Jay Kreps
> Fix For: 0.10.0.0
>
>
> On startup, the log segments within a logDir are currently loaded 
> sequentially when there is an unclean shutdown. This takes a lot of time as 
> logSegment.recover(..) is called for every segment, and for brokers which 
> have many partitions the time taken will be very high (we have noticed 
> ~40 mins for 2k partitions).
> https://github.com/apache/kafka/pull/1035
> This pull request makes the log-segment load parallel, with two 
> configurable properties "log.recovery.threads" and 
> "log.recovery.max.interval.ms".
> Logic:
> 1. Define a threadpool of fixed size (log.recovery.threads).
> 2. Submit each logSegment recovery as a job to the threadpool and add the 
> future returned to a job list.
> 3. Wait till all the jobs are done within the required time 
> (log.recovery.max.interval.ms - default set to Long.MAX_VALUE).
> 4. If they are done and the futures are all null (meaning that the jobs 
> completed successfully), recovery is considered done.
> 5. If any of the recovery jobs failed, it is logged and 
> LogRecoveryFailedException is thrown.
> 6. If the timeout is reached, LogRecoveryFailedException is thrown.
> The logic is backward compatible with the current sequential implementation 
> as the default thread count is set to 1.
> PS: I am new to Scala and the code might look Java-ish, but I will be happy 
> to modify the code per review comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3359) Parallel log-recovery of un-flushed segments on startup

2016-03-19 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-3359:
-
Priority: Major  (was: Minor)

> Parallel log-recovery of un-flushed segments on startup
> ---
>
> Key: KAFKA-3359
> URL: https://issues.apache.org/jira/browse/KAFKA-3359
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.2.2, 0.9.0.1
>Reporter: Vamsi Subhash Achanta
>Assignee: Jay Kreps
> Fix For: 0.10.0.0
>
>
> On startup, the log segments within a logDir are currently loaded 
> sequentially when there is an unclean shutdown. This takes a lot of time as 
> logSegment.recover(..) is called for every segment, and for brokers which 
> have many partitions the time taken will be very high (we have noticed 
> ~40 mins for 2k partitions).
> https://github.com/apache/kafka/pull/1035
> This pull request makes the log-segment load parallel, with two 
> configurable properties "log.recovery.threads" and 
> "log.recovery.max.interval.ms".
> Logic:
> 1. Define a threadpool of fixed size (log.recovery.threads).
> 2. Submit each logSegment recovery as a job to the threadpool and add the 
> future returned to a job list.
> 3. Wait till all the jobs are done within the required time 
> (log.recovery.max.interval.ms - default set to Long.MAX_VALUE).
> 4. If they are done and the futures are all null (meaning that the jobs 
> completed successfully), recovery is considered done.
> 5. If any of the recovery jobs failed, it is logged and 
> LogRecoveryFailedException is thrown.
> 6. If the timeout is reached, LogRecoveryFailedException is thrown.
> The logic is backward compatible with the current sequential implementation 
> as the default thread count is set to 1.
> PS: I am new to Scala and the code might look Java-ish, but I will be happy 
> to modify the code per review comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3359) Parallel log-recovery of un-flushed segments on startup

2016-03-10 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-3359:
-
 Reviewer: Grant Henke
Fix Version/s: 0.10.0.0
   Status: Patch Available  (was: Open)

https://github.com/apache/kafka/pull/1035

> Parallel log-recovery of un-flushed segments on startup
> ---
>
> Key: KAFKA-3359
> URL: https://issues.apache.org/jira/browse/KAFKA-3359
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.9.0.1, 0.8.2.2
>Reporter: Vamsi Subhash Achanta
>Assignee: Jay Kreps
>Priority: Minor
> Fix For: 0.10.0.0
>
>
> On startup, the log segments within a logDir are currently loaded 
> sequentially when there is an unclean shutdown. This takes a lot of time as 
> logSegment.recover(..) is called for every segment, and for brokers which 
> have many partitions the time taken will be very high (we have noticed 
> ~40 mins for 2k partitions).
> https://github.com/apache/kafka/pull/1035
> This pull request makes the log-segment load parallel, with two 
> configurable properties "log.recovery.threads" and 
> "log.recovery.max.interval.ms".
> Logic:
> 1. Define a threadpool of fixed size (log.recovery.threads).
> 2. Submit each logSegment recovery as a job to the threadpool and add the 
> future returned to a job list.
> 3. Wait till all the jobs are done within the required time 
> (log.recovery.max.interval.ms - default set to Long.MAX_VALUE).
> 4. If they are done and the futures are all null (meaning that the jobs 
> completed successfully), recovery is considered done.
> 5. If any of the recovery jobs failed, it is logged and 
> LogRecoveryFailedException is thrown.
> 6. If the timeout is reached, LogRecoveryFailedException is thrown.
> The logic is backward compatible with the current sequential implementation 
> as the default thread count is set to 1.
> PS: I am new to Scala and the code might look Java-ish, but I will be happy 
> to modify the code per review comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3359) Parallel log-recovery of un-flushed segments on startup

2016-03-09 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-3359:
-
Issue Type: Improvement  (was: Bug)

> Parallel log-recovery of un-flushed segments on startup
> ---
>
> Key: KAFKA-3359
> URL: https://issues.apache.org/jira/browse/KAFKA-3359
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.2.2, 0.9.0.1
>Reporter: Vamsi Subhash Achanta
>Assignee: Jay Kreps
>Priority: Minor
>
> On startup, the log segments within a logDir are currently loaded 
> sequentially when there is an unclean shutdown. This takes a lot of time as 
> logSegment.recover(..) is called for every segment, and for brokers which 
> have many partitions the time taken will be very high (we have noticed 
> ~40 mins for 2k partitions).
> https://github.com/apache/kafka/pull/1035
> This pull request makes the log-segment load parallel, with two 
> configurable properties "log.recovery.threads" and 
> "log.recovery.max.interval.ms".
> Logic:
> 1. Define a threadpool of fixed size (log.recovery.threads).
> 2. Submit each logSegment recovery as a job to the threadpool and add the 
> future returned to a job list.
> 3. Wait till all the jobs are done within the required time 
> (log.recovery.max.interval.ms - default set to Long.MAX_VALUE).
> 4. If they are done and the futures are all null (meaning that the jobs 
> completed successfully), recovery is considered done.
> 5. If any of the recovery jobs failed, it is logged and 
> LogRecoveryFailedException is thrown.
> 6. If the timeout is reached, LogRecoveryFailedException is thrown.
> The logic is backward compatible with the current sequential implementation 
> as the default thread count is set to 1.
> PS: I am new to Scala and the code might look Java-ish, but I will be happy 
> to modify the code per review comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3359) Parallel log-recovery of un-flushed segments on startup

2016-03-09 Thread Vamsi Subhash Achanta (JIRA)
Vamsi Subhash Achanta created KAFKA-3359:


 Summary: Parallel log-recovery of un-flushed segments on startup
 Key: KAFKA-3359
 URL: https://issues.apache.org/jira/browse/KAFKA-3359
 Project: Kafka
  Issue Type: Bug
  Components: log
Affects Versions: 0.9.0.1, 0.8.2.2
Reporter: Vamsi Subhash Achanta
Assignee: Jay Kreps
Priority: Minor


On startup, the log segments within a logDir are currently loaded sequentially 
when there is an unclean shutdown. This takes a lot of time as 
logSegment.recover(..) is called for every segment, and for brokers which have 
many partitions the time taken will be very high (we have noticed ~40 mins for 
2k partitions).

https://github.com/apache/kafka/pull/1035

This pull request makes the log-segment load parallel, with two configurable 
properties "log.recovery.threads" and "log.recovery.max.interval.ms".

Logic:
1. Define a threadpool of fixed size (log.recovery.threads).
2. Submit each logSegment recovery as a job to the threadpool and add the 
future returned to a job list.
3. Wait till all the jobs are done within the required time 
(log.recovery.max.interval.ms - default set to Long.MAX_VALUE).
4. If they are done and the futures are all null (meaning that the jobs 
completed successfully), recovery is considered done.
5. If any of the recovery jobs failed, it is logged and 
LogRecoveryFailedException is thrown.
6. If the timeout is reached, LogRecoveryFailedException is thrown.

The logic is backward compatible with the current sequential implementation as 
the default thread count is set to 1.

PS: I am new to Scala and the code might look Java-ish, but I will be happy to 
modify the code per review comments.
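
A minimal sketch of the logic described above, in Java rather than the Scala 
of the actual PR; recoverSegment, the segment list, and reading the two 
properties via system properties are placeholders, and only the two property 
names come from the description.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ParallelLogRecovery {
    static void recoverSegment(String segment) {
        // stand-in for logSegment.recover(..)
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = Integer.getInteger("log.recovery.threads", 1);
        long maxMs = Long.getLong("log.recovery.max.interval.ms", Long.MAX_VALUE);
        List<String> segments = Arrays.asList("seg-0", "seg-1", "seg-2");

        ExecutorService pool = Executors.newFixedThreadPool(threads); // step 1
        List<Future<?>> jobs = new ArrayList<Future<?>>();
        for (final String seg : segments) {
            jobs.add(pool.submit(new Runnable() {                    // step 2
                public void run() { recoverSegment(seg); }
            }));
        }
        pool.shutdown();
        // step 3: wait for all jobs within the configured interval
        if (!pool.awaitTermination(maxMs, TimeUnit.MILLISECONDS)) {
            throw new RuntimeException("log recovery timed out");    // step 6
        }
        for (Future<?> job : jobs) {
            try {
                job.get();              // null result means success (step 4)
            } catch (ExecutionException e) {
                // step 5: a recovery job failed; log and propagate
                throw new RuntimeException("log recovery failed", e);
            }
        }
    }
}

With log.recovery.threads left at its default of 1 this degenerates to the 
sequential behaviour, which is the backward-compatibility argument above.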



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2163) Offsets manager cache should prevent stale-offset-cleanup while an offset load is in progress; otherwise we can lose consumer offsets

2016-03-08 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185020#comment-15185020
 ] 

Vamsi Subhash Achanta commented on KAFKA-2163:
--

Can this be merged into the 0.8.2.2 branch?

> Offsets manager cache should prevent stale-offset-cleanup while an offset 
> load is in progress; otherwise we can lose consumer offsets
> -
>
> Key: KAFKA-2163
> URL: https://issues.apache.org/jira/browse/KAFKA-2163
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Joel Koshy
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2163.patch
>
>
> When leadership of an offsets partition moves, the new leader loads offsets 
> from that partition into the offset manager cache.
> Independently, the offset manager has a periodic cleanup task for stale 
> offsets that removes old offsets from the cache and appends tombstones for 
> those. If the partition happens to contain much older offsets (earlier in 
> the log), the load inserts those into the cache; the cleanup task may then 
> run, see those offsets (which it deems to be stale), remove them from the 
> cache, and append a tombstone to the end of the log. The tombstone will 
> override the true latest offset and a subsequent offset fetch request will 
> return no offset.
> We just need to prevent the cleanup task from running during an offset load.
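
A minimal sketch (with assumed names, not the actual offset manager code) of 
the guard the last line describes: loads take a shared lock, and the cleanup 
pass simply skips its run whenever any load is in progress.

import java.util.concurrent.locks.ReentrantReadWriteLock;

class OffsetsCacheSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void loadOffsetsFromLog(int partition) {
        lock.readLock().lock(); // several loads may run concurrently
        try {
            // ... read offsets from the partition's log into the cache ...
        } finally {
            lock.readLock().unlock();
        }
    }

    void cleanupStaleOffsets() {
        // Skip the pass entirely while any load holds the read lock;
        // cleaning mid-load could tombstone the true latest offset.
        if (!lock.writeLock().tryLock()) {
            return;
        }
        try {
            // ... remove stale entries and append tombstones ...
        } finally {
            lock.writeLock().unlock();
        }
    }
}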



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2967) Move Kafka documentation to ReStructuredText

2016-03-02 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177342#comment-15177342
 ] 

Vamsi Subhash Achanta commented on KAFKA-2967:
--

Emacs org-mode works very well for documentation: http://orgmode.org/
It can generate LaTeX output as well, and writing documentation in it is a 
pleasure. It can also be used by people who don't use Emacs.

> Move Kafka documentation to ReStructuredText
> 
>
> Key: KAFKA-2967
> URL: https://issues.apache.org/jira/browse/KAFKA-2967
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Gwen Shapira
>
> Storing documentation as HTML is kind of BS :)
> * Formatting is a pain, and making it look good is even worse
> * It's just HTML, so we can't generate PDFs
> * Reading and editing is painful
> * Validating changes is hard because our formatting relies on all kinds of 
> Apache Server features.
> I suggest:
> * Move to RST
> * Generate HTML and PDF during the build using the Sphinx plugin for Gradle.
> Lots of Apache projects are doing this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1229) Reload broker config without a restart

2015-05-24 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557737#comment-14557737
 ] 

Vamsi Subhash Achanta commented on KAFKA-1229:
--

Can this be revisited?

At least the zookeeper.connect string?

> Reload broker config without a restart
> --
>
> Key: KAFKA-1229
> URL: https://issues.apache.org/jira/browse/KAFKA-1229
> Project: Kafka
>  Issue Type: Wish
>  Components: config
>Affects Versions: 0.8.0
>Reporter: Carlo Cabanilla
>Priority: Minor
>
> In order to minimize client disruption, ideally you'd be able to reload 
> broker config without having to restart the server. On *nix systems the 
> convention is to have the process reread its configuration if it receives a 
> SIGHUP signal.
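
For illustration only, a minimal sketch of the *nix convention the request 
mentions; it uses the unsupported sun.misc.Signal API, and Kafka 0.8.x has no 
such hook, so the config path and reload behaviour here are hypothetical.

import java.io.FileInputStream;
import java.util.Properties;
import sun.misc.Signal;
import sun.misc.SignalHandler;

public class HupReload {
    static volatile Properties config = new Properties();

    static void reload(String path) {
        Properties fresh = new Properties();
        try {
            FileInputStream in = new FileInputStream(path);
            try {
                fresh.load(in);
            } finally {
                in.close();
            }
            config = fresh; // publish the new config atomically
        } catch (Exception e) {
            System.err.println("reload failed, keeping old config: " + e);
        }
    }

    public static void main(String[] args) throws Exception {
        final String path = "server.properties"; // hypothetical path
        reload(path);
        Signal.handle(new Signal("HUP"), new SignalHandler() {
            public void handle(Signal sig) {
                reload(path); // SIGHUP => reread configuration
            }
        });
        Thread.sleep(Long.MAX_VALUE); // stand-in for the running server
    }
}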



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1977) Make logEndOffset available in the Zookeeper consumer

2015-05-18 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549252#comment-14549252
 ] 

Vamsi Subhash Achanta commented on KAFKA-1977:
--

Hi Will,

I have also applied this patch to 0.8.2.2. All the tests passed and I am now 
using it in production. It seems to give correct values for the last 8 hours. 
Thanks for the patch.

> Make logEndOffset available in the Zookeeper consumer
> -
>
> Key: KAFKA-1977
> URL: https://issues.apache.org/jira/browse/KAFKA-1977
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Will Funnell
>Priority: Minor
> Attachments: 
> Make_logEndOffset_available_in_the_Zookeeper_consumer.patch
>
>
> The requirement is to create a snapshot from the Kafka topic but NOT do 
> continual reads after that point. For example you might be creating a backup 
> of the data to a file.
> In order to achieve that, a recommended solution by Joel Koshy and Jay Kreps 
> was to expose the high watermark, as maxEndOffset, from the FetchResponse 
> object through to each MessageAndMetadata object in order to be aware when 
> the consumer has reached the end of each partition.
> The submitted patch achieves this by adding the maxEndOffset to the 
> PartitionTopicInfo, which is updated when a new message arrives in the 
> ConsumerFetcherThread and then exposed in MessageAndMetadata.
> See here for discussion:
> http://search-hadoop.com/m/4TaT4TpJy71
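
A small self-contained sketch of the stop condition this would enable; the 
names are illustrative, not the patched API. The point is that once the 
consumed offset catches up to the maxEndOffset captured when the fetch was 
served, the snapshot of that partition is complete.

final class SnapshotBoundary {
    /** True once the consumed offset has caught up to the snapshot boundary. */
    static boolean reachedSnapshotEnd(long consumedOffset, long maxEndOffset) {
        // Assuming the high watermark is the offset of the next message to be
        // exposed, the last readable message sits at maxEndOffset - 1.
        return consumedOffset >= maxEndOffset - 1;
    }

    public static void main(String[] args) {
        long maxEndOffset = 100; // illustrative high watermark at fetch time
        for (long offset = 0; ; offset++) {
            // ... write the message at `offset` to the backup file ...
            if (reachedSnapshotEnd(offset, maxEndOffset)) {
                break; // stop instead of continuing to read the live topic
            }
        }
        System.out.println("snapshot complete");
    }
}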



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1977) Make logEndOffset available in the Zookeeper consumer

2015-05-18 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549252#comment-14549252
 ] 

Vamsi Subhash Achanta edited comment on KAFKA-1977 at 5/18/15 9:21 PM:
---

Hi Will,

I have also applied this patch to 0.8.2.1. All the tests passed and I am now 
using it in production. It seems to give correct values for all consumers. 
Thanks for the patch.


was (Author: vamsi360):
Hi Will,

I have also applied this patch to 0.8.2.2. All the tests passed and I am now 
using it in production. It seems to give correct values for the last 8 hours. 
Thanks for the patch.

> Make logEndOffset available in the Zookeeper consumer
> -
>
> Key: KAFKA-1977
> URL: https://issues.apache.org/jira/browse/KAFKA-1977
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Will Funnell
>Priority: Minor
> Attachments: 
> Make_logEndOffset_available_in_the_Zookeeper_consumer.patch
>
>
> The requirement is to create a snapshot from the Kafka topic but NOT do 
> continual reads after that point. For example you might be creating a backup 
> of the data to a file.
> In order to achieve that, a recommended solution by Joel Koshy and Jay Kreps 
> was to expose the high watermark, as maxEndOffset, from the FetchResponse 
> object through to each MessageAndMetadata object in order to be aware when 
> the consumer has reached the end of each partition.
> The submitted patch achieves this by adding the maxEndOffset to the 
> PartitionTopicInfo, which is updated when a new message arrives in the 
> ConsumerFetcherThread and then exposed in MessageAndMetadata.
> See here for discussion:
> http://search-hadoop.com/m/4TaT4TpJy71



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1977) Make logEndOffset available in the Zookeeper consumer

2015-05-18 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547606#comment-14547606
 ] 

Vamsi Subhash Achanta commented on KAFKA-1977:
--

Can this be merged into the old consumer code in the 0.8.2.x branch? We have 
the same kind of requirement.

> Make logEndOffset available in the Zookeeper consumer
> -
>
> Key: KAFKA-1977
> URL: https://issues.apache.org/jira/browse/KAFKA-1977
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Will Funnell
>Priority: Minor
> Attachments: 
> Make_logEndOffset_available_in_the_Zookeeper_consumer.patch
>
>
> The requirement is to create a snapshot from the Kafka topic but NOT do 
> continual reads after that point. For example you might be creating a backup 
> of the data to a file.
> In order to achieve that, a recommended solution by Joel Koshy and Jay Kreps 
> was to expose the high watermark, as maxEndOffset, from the FetchResponse 
> object through to each MessageAndMetadata object in order to be aware when 
> the consumer has reached the end of each partition.
> The submitted patch achieves this by adding the maxEndOffset to the 
> PartitionTopicInfo, which is updated when a new message arrives in the 
> ConsumerFetcherThread and then exposed in MessageAndMetadata.
> See here for discussion:
> http://search-hadoop.com/m/4TaT4TpJy71



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2134) Producer blocked on metric publish

2015-04-23 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-2134:
-
Priority: Blocker  (was: Major)

> Producer blocked on metric publish
> --
>
> Key: KAFKA-2134
> URL: https://issues.apache.org/jira/browse/KAFKA-2134
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.1
> Environment: debian7, java8
>Reporter: Vamsi Subhash Achanta
>Assignee: Jun Rao
>Priority: Blocker
>
> Hi,
> We have a REST API to publish to a topic. Yesterday, we started noticing that 
> the producer is not able to produce messages at a good rate and the 
> CLOSE_WAITs of our producer REST app are very high. All the producer REST 
> requests are hence timing out.
> When we took the thread dump and analysed it, we noticed that the threads are 
> getting blocked on JmxReporter metricChange. Here is the attached stack trace.
> "dw-70 - POST /queues/queue_1/messages" #70 prio=5 os_prio=0 
> tid=0x7f043c8bd000 nid=0x54cf waiting for monitor entry 
> [0x7f04363c7000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:76)
> - waiting to lock 0x0005c1823860 (a java.lang.Object)
> at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:182)
> - locked 0x0007a5e526c8 (a 
> org.apache.kafka.common.metrics.Metrics)
> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:165)
> - locked 0x0007a5e526e8 (a 
> org.apache.kafka.common.metrics.Sensor)
> When I looked at the code of the metricChange method, it uses a synchronized 
> block on an object, and it seems the lock is held by another thread.
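
A minimal sketch (illustrative classes, not Kafka's code) of the lock pattern 
the dump shows: every metric registration funnels through one shared monitor 
inside the reporter, so threads that register sensors per request end up 
serialized behind it.

public class MetricLockSketch {
    static class Reporter {
        private final Object lock = new Object();
        void metricChange(String metric) {
            synchronized (lock) { // the single lock all registrations contend on
                // ... (re)register the MBean for `metric` ...
            }
        }
    }

    static class Metrics {
        private final Reporter reporter = new Reporter();
        synchronized void registerMetric(String name) { // second monitor in dump
            reporter.metricChange(name);
        }
    }

    public static void main(String[] args) {
        final Metrics metrics = new Metrics();
        for (int i = 0; i < 8; i++) { // 8 "request" threads racing to register
            final int id = i;
            new Thread(new Runnable() {
                public void run() {
                    metrics.registerMetric("sensor-" + id);
                }
            }).start();
        }
    }
}

If sensors (or whole producers) are created per REST request, this 
registration path runs on every request; reusing one producer and 
pre-creating sensors keeps it off the hot path.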



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2134) Producer blocked on metric publish

2015-04-23 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509083#comment-14509083
 ] 

Vamsi Subhash Achanta commented on KAFKA-2134:
--

We noticed this intermittently. What could be the problem?

> Producer blocked on metric publish
> --
>
> Key: KAFKA-2134
> URL: https://issues.apache.org/jira/browse/KAFKA-2134
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.1
> Environment: debian7, java8
>Reporter: Vamsi Subhash Achanta
>Assignee: Jun Rao
>Priority: Blocker
>
> Hi,
> We have a REST API to publish to a topic. Yesterday, we started noticing that 
> the producer is not able to produce messages at a good rate and the 
> CLOSE_WAITs of our producer REST app are very high. All the producer REST 
> requests are hence timing out.
> When we took the thread dump and analysed it, we noticed that the threads are 
> getting blocked on JmxReporter metricChange. Here is the attached stack trace.
> "dw-70 - POST /queues/queue_1/messages" #70 prio=5 os_prio=0 
> tid=0x7f043c8bd000 nid=0x54cf waiting for monitor entry 
> [0x7f04363c7000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:76)
> - waiting to lock 0x0005c1823860 (a java.lang.Object)
> at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:182)
> - locked 0x0007a5e526c8 (a 
> org.apache.kafka.common.metrics.Metrics)
> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:165)
> - locked 0x0007a5e526e8 (a 
> org.apache.kafka.common.metrics.Sensor)
> When I looked at the code of the metricChange method, it uses a synchronized 
> block on an object, and it seems the lock is held by another thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2134) Producer blocked on metric publish

2015-04-20 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-2134:
-
Description: 
Hi,

We have a REST API to publish to a topic. Yesterday, we started noticing that 
the producer is not able to produce messages at a good rate and the CLOSE_WAITs 
of our producer REST app are very high. All the producer REST requests are 
hence timing out.

When we took the thread dump and analysed it, we noticed that the threads are 
getting blocked on JmxReporter metricChange. Here is the attached stack trace.

"dw-70 - POST /queues/queue_1/messages" #70 prio=5 os_prio=0 
tid=0x7f043c8bd000 nid=0x54cf waiting for monitor entry [0x7f04363c7000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:76)
- waiting to lock 0x0005c1823860 (a java.lang.Object)
at 
org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:182)
- locked 0x0007a5e526c8 (a 
org.apache.kafka.common.metrics.Metrics)
at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:165)
- locked 0x0007a5e526e8 (a org.apache.kafka.common.metrics.Sensor)

When I looked at the code of the metricChange method, it uses a synchronized 
block on an object, and it seems the lock is held by another thread.

  was:
Hi,

We have a REST API to publish to a topic. Yesterday, we started noticing that 
the producer is not able to produce messages at a good rate and the CLOSE_WAITs 
of our producer REST app are very high. All the producer REST requests are 
hence timing out.

When we took the thread dump and analysed it, we noticed that the threads are 
getting blocked on JmxReporter metricChange. Here is the attached stack trace.

"dw-70 - POST /queues/ekl_bigfoot_marvin_production_1/messages" #70 prio=5 
os_prio=0 tid=0x7f043c8bd000 nid=0x54cf waiting for monitor entry 
[0x7f04363c7000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:76)
- waiting to lock 0x0005c1823860 (a java.lang.Object)
at 
org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:182)
- locked 0x0007a5e526c8 (a 
org.apache.kafka.common.metrics.Metrics)
at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:165)
- locked 0x0007a5e526e8 (a org.apache.kafka.common.metrics.Sensor)

When I looked at the code of the metricChange method, it uses a synchronized 
block on an object, and it seems the lock is held by another thread.


> Producer blocked on metric publish
> --
>
> Key: KAFKA-2134
> URL: https://issues.apache.org/jira/browse/KAFKA-2134
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.1
> Environment: debian7, java8
>Reporter: Vamsi Subhash Achanta
>Assignee: Jun Rao
>
> Hi,
> We have a REST API to publish to a topic. Yesterday, we started noticing that 
> the producer is not able to produce messages at a good rate and the 
> CLOSE_WAITs of our producer REST app are very high. All the producer REST 
> requests are hence timing out.
> When we took the thread dump and analysed it, we noticed that the threads are 
> getting blocked on JmxReporter metricChange. Here is the attached stack trace.
> "dw-70 - POST /queues/queue_1/messages" #70 prio=5 os_prio=0 
> tid=0x7f043c8bd000 nid=0x54cf waiting for monitor entry 
> [0x7f04363c7000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:76)
> - waiting to lock 0x0005c1823860 (a java.lang.Object)
> at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:182)
> - locked 0x0007a5e526c8 (a 
> org.apache.kafka.common.metrics.Metrics)
> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:165)
> - locked 0x0007a5e526e8 (a 
> org.apache.kafka.common.metrics.Sensor)
> When I looked at the code of the metricChange method, it uses a synchronized 
> block on an object, and it seems the lock is held by another thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1791) Corrupt index after safe shutdown and restart

2014-11-25 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-1791:
-
Attachment: 0233.log

Attached the log file.

> Corrupt index after safe shutdown and restart
> -
>
> Key: KAFKA-1791
> URL: https://issues.apache.org/jira/browse/KAFKA-1791
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
> Environment: Debian6 with Sun-Java6
>Reporter: Vamsi Subhash Achanta
>Priority: Critical
> Attachments: 0233.index, 0233.log
>
>
> We have 3 kafka brokers - all VMs. One of the brokers was stopped for around 
> 30 minutes to fix a problem with the bare metal. Upon restart, after some 
> time, the broker ran out of file descriptors (FDs) and started throwing 
> errors. Upon restart, it started throwing these corrupted-index exceptions. 
> I followed the other JIRAs related to corrupted indices but the solutions 
> mentioned there (deleting the index and restarting) didn't work - the index 
> gets created again. The other JIRAs' solution of deleting those indexes 
> which got wrongly compacted (> 10MB) didn't work either.
> What is the error? How can I fix this and bring back the broker? Thanks.
> INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Found clean 
> shutdown file. Skipping recovery for all logs in data directory 
> '/var/lib/fk-3p-kafka/logs'
>  INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Loading log 
> 'kf.production.b2b.return_order.status-25'
> FATAL [2014-11-21 02:57:17,533] [main][] kafka.server.KafkaServerStartable - 
> Fatal error during KafkaServerStable startup. Prepare to shutdown
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file 
> (/var/lib/fk-3p-kafka/logs/kf.production.b2b.return_order.status-25/0233.index)
>  has non-zero size but the last offset is 233 and the base offset is 233
>   at scala.Predef$.require(Predef.scala:145)
>   at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:159)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:158)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:631)
>   at 
>  scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:474)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
>   at 
>  scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:495)
>   at kafka.log.Log.loadSegments(Log.scala:158)
>   at kafka.log.Log.<init>(Log.scala:64)
>   at 
>  kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
>   at 
>  kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
>   at 
>  scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
>   at 
>  scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
>   at kafka.log.LogManager.loadLogs(LogManager.scala:105)
>   at kafka.log.LogManager.<init>(LogManager.scala:57)
>   at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
>   at 
>  kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>   at kafka.Kafka$.main(Kafka.scala:46)
>   at kafka.Kafka.main(Kafka.scala)
>  INFO [2014-11-21 02:57:17,534] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shutting down
>  INFO [2014-11-21 02:57:17,538] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shut down completed
>  INFO [2014-11-21 02:57:17,539] [Thread-2][] kafka.server.KafkaServer - 
> [Kafka Server 2], shutting down



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1791) Corrupt index after safe shutdown and restart

2014-11-21 Thread Vamsi Subhash Achanta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsi Subhash Achanta updated KAFKA-1791:
-
Attachment: 0233.index

This happened over a period of time during which the exception was thrown 
again and again. The file size, when copied, had increased to 10MB. Was it 
compressed?

> Corrupt index after safe shutdown and restart
> -
>
> Key: KAFKA-1791
> URL: https://issues.apache.org/jira/browse/KAFKA-1791
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
> Environment: Debian6 with Sun-Java6
>Reporter: Vamsi Subhash Achanta
>Priority: Critical
> Attachments: 0233.index
>
>
> We have 3 kafka brokers - all VMs. One of the brokers was stopped for around 
> 30 minutes to fix a problem with the bare metal. Upon restart, after some 
> time, the broker ran out of file descriptors (FDs) and started throwing 
> errors. Upon restart, it started throwing these corrupted-index exceptions. 
> I followed the other JIRAs related to corrupted indices but the solutions 
> mentioned there (deleting the index and restarting) didn't work - the index 
> gets created again. The other JIRAs' solution of deleting those indexes 
> which got wrongly compacted (> 10MB) didn't work either.
> What is the error? How can I fix this and bring back the broker? Thanks.
> INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Found clean 
> shutdown file. Skipping recovery for all logs in data directory 
> '/var/lib/fk-3p-kafka/logs'
>  INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Loading log 
> 'kf.production.b2b.return_order.status-25'
> FATAL [2014-11-21 02:57:17,533] [main][] kafka.server.KafkaServerStartable - 
> Fatal error during KafkaServerStable startup. Prepare to shutdown
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file 
> (/var/lib/fk-3p-kafka/logs/kf.production.b2b.return_order.status-25/0233.index)
>  has non-zero size but the last offset is 233 and the base offset is 233
>   at scala.Predef$.require(Predef.scala:145)
>   at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:159)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:158)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:631)
>   at 
>  scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:474)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
>   at 
>  scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:495)
>   at kafka.log.Log.loadSegments(Log.scala:158)
>   at kafka.log.Log.<init>(Log.scala:64)
>   at 
>  kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
>   at 
>  kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
>   at 
>  scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
>   at 
>  scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
>   at kafka.log.LogManager.loadLogs(LogManager.scala:105)
>   at kafka.log.LogManager.<init>(LogManager.scala:57)
>   at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
>   at 
>  kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>   at kafka.Kafka$.main(Kafka.scala:46)
>   at kafka.Kafka.main(Kafka.scala)
>  INFO [2014-11-21 02:57:17,534] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shutting down
>  INFO [2014-11-21 02:57:17,538] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shut down completed
>  INFO [2014-11-21 02:57:17,539] [Thread-2][] kafka.server.KafkaServer - 
> [Kafka Server 2], shutting down



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1791) Corrupt index after safe shutdown and restart

2014-11-21 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221851#comment-14221851
 ] 

Vamsi Subhash Achanta commented on KAFKA-1791:
--

Attached the file.

> Corrupt index after safe shutdown and restart
> -
>
> Key: KAFKA-1791
> URL: https://issues.apache.org/jira/browse/KAFKA-1791
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
> Environment: Debian6 with Sun-Java6
>Reporter: Vamsi Subhash Achanta
>Priority: Critical
> Attachments: 0233.index
>
>
> We have 3 kafka brokers - all VMs. One of the brokers was stopped for around 
> 30 minutes to fix a problem with the bare metal. Upon restart, after some 
> time, the broker ran out of file descriptors (FDs) and started throwing 
> errors. Upon restart, it started throwing these corrupted-index exceptions. 
> I followed the other JIRAs related to corrupted indices but the solutions 
> mentioned there (deleting the index and restarting) didn't work - the index 
> gets created again. The other JIRAs' solution of deleting those indexes 
> which got wrongly compacted (> 10MB) didn't work either.
> What is the error? How can I fix this and bring back the broker? Thanks.
> INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Found clean 
> shutdown file. Skipping recovery for all logs in data directory 
> '/var/lib/fk-3p-kafka/logs'
>  INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Loading log 
> 'kf.production.b2b.return_order.status-25'
> FATAL [2014-11-21 02:57:17,533] [main][] kafka.server.KafkaServerStartable - 
> Fatal error during KafkaServerStable startup. Prepare to shutdown
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file 
> (/var/lib/fk-3p-kafka/logs/kf.production.b2b.return_order.status-25/0233.index)
>  has non-zero size but the last offset is 233 and the base offset is 233
>   at scala.Predef$.require(Predef.scala:145)
>   at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:159)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:158)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:631)
>   at 
>  scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:474)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
>   at 
>  scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:495)
>   at kafka.log.Log.loadSegments(Log.scala:158)
>   at kafka.log.Log.<init>(Log.scala:64)
>   at 
>  kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
>   at 
>  kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
>   at 
>  scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
>   at 
>  scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
>   at kafka.log.LogManager.loadLogs(LogManager.scala:105)
>   at kafka.log.LogManager.<init>(LogManager.scala:57)
>   at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
>   at 
>  kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>   at kafka.Kafka$.main(Kafka.scala:46)
>   at kafka.Kafka.main(Kafka.scala)
>  INFO [2014-11-21 02:57:17,534] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shutting down
>  INFO [2014-11-21 02:57:17,538] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shut down completed
>  INFO [2014-11-21 02:57:17,539] [Thread-2][] kafka.server.KafkaServer - 
> [Kafka Server 2], shutting down



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-1791) Corrupt index after safe shutdown and restart

2014-11-20 Thread Vamsi Subhash Achanta (JIRA)
Vamsi Subhash Achanta created KAFKA-1791:


 Summary: Corrupt index after safe shutdown and restart
 Key: KAFKA-1791
 URL: https://issues.apache.org/jira/browse/KAFKA-1791
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1
 Environment: Debian6 with Sun-Java6
Reporter: Vamsi Subhash Achanta
Priority: Critical


We have 3 kafka brokers - all VMs. One of the brokers was stopped for around 30 
minutes to fix a problem with the bare metal. Upon restart, after some time, 
the broker ran out of file descriptors (FDs) and started throwing errors. Upon 
restart, it started throwing these corrupted-index exceptions. I followed the 
other JIRAs related to corrupted indices but the solutions mentioned there 
(deleting the index and restarting) didn't work - the index gets created again. 
The other JIRAs' solution of deleting those indexes which got wrongly compacted 
(> 10MB) didn't work either.

What is the error? How can I fix this and bring back the broker? Thanks.

INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Found clean 
shutdown file. Skipping recovery for all logs in data directory 
'/var/lib/fk-3p-kafka/logs'
 INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Loading log 
'kf.production.b2b.return_order.status-25'
FATAL [2014-11-21 02:57:17,533] [main][] kafka.server.KafkaServerStartable - 
Fatal error during KafkaServerStable startup. Prepare to shutdown
java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
index file 
(/var/lib/fk-3p-kafka/logs/kf.production.b2b.return_order.status-25/0233.index)
 has non-zero size but the last offset is 233 and the base offset is 233
at scala.Predef$.require(Predef.scala:145)
at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:159)
at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:158)
at scala.collection.Iterator$class.foreach(Iterator.scala:631)
at 
scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:474)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
at 
scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:495)
at kafka.log.Log.loadSegments(Log.scala:158)
at kafka.log.Log.<init>(Log.scala:64)
at 
kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
at 
kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
at kafka.log.LogManager.loadLogs(LogManager.scala:105)
at kafka.log.LogManager.<init>(LogManager.scala:57)
at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
at 
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
at kafka.Kafka$.main(Kafka.scala:46)
at kafka.Kafka.main(Kafka.scala)
 INFO [2014-11-21 02:57:17,534] [main][] kafka.server.KafkaServer - [Kafka 
Server 2], shutting down
 INFO [2014-11-21 02:57:17,538] [main][] kafka.server.KafkaServer - [Kafka 
Server 2], shut down completed
 INFO [2014-11-21 02:57:17,539] [Thread-2][] kafka.server.KafkaServer - [Kafka 
Server 2], shutting down
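
For reference, the check that throws above (OffsetIndex.sanityCheck in 0.8.1 
is Scala; this Java paraphrase is reconstructed only from the error message): 
a non-empty index must end at an offset strictly greater than the segment's 
base offset, and here both are 233.

import java.io.File;

final class IndexSanitySketch {
    static void sanityCheck(File indexFile, long baseOffset, long lastOffset) {
        if (indexFile.length() != 0 && lastOffset <= baseOffset) {
            throw new IllegalArgumentException(String.format(
                "requirement failed: Corrupt index found, index file (%s) has "
                + "non-zero size but the last offset is %d and the base offset is %d",
                indexFile.getAbsolutePath(), lastOffset, baseOffset));
        }
    }
}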



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1791) Corrupt index after safe shutdown and restart

2014-11-20 Thread Vamsi Subhash Achanta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220499#comment-14220499
 ] 

Vamsi Subhash Achanta commented on KAFKA-1791:
--

It's around 40 KB.

> Corrupt index after safe shutdown and restart
> -
>
> Key: KAFKA-1791
> URL: https://issues.apache.org/jira/browse/KAFKA-1791
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8.1
> Environment: Debian6 with Sun-Java6
>Reporter: Vamsi Subhash Achanta
>Priority: Critical
>
> We have 3 kafka brokers - all VMs. One of the brokers was stopped for around 
> 30 minutes to fix a problem with the bare metal. Upon restart, after some 
> time, the broker ran out of file descriptors (FDs) and started throwing 
> errors. Upon restart, it started throwing these corrupted-index exceptions. 
> I followed the other JIRAs related to corrupted indices but the solutions 
> mentioned there (deleting the index and restarting) didn't work - the index 
> gets created again. The other JIRAs' solution of deleting those indexes 
> which got wrongly compacted (> 10MB) didn't work either.
> What is the error? How can I fix this and bring back the broker? Thanks.
> INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Found clean 
> shutdown file. Skipping recovery for all logs in data directory 
> '/var/lib/fk-3p-kafka/logs'
>  INFO [2014-11-21 02:57:17,510] [main][] kafka.log.LogManager - Loading log 
> 'kf.production.b2b.return_order.status-25'
> FATAL [2014-11-21 02:57:17,533] [main][] kafka.server.KafkaServerStartable - 
> Fatal error during KafkaServerStable startup. Prepare to shutdown
> java.lang.IllegalArgumentException: requirement failed: Corrupt index found, 
> index file 
> (/var/lib/fk-3p-kafka/logs/kf.production.b2b.return_order.status-25/0233.index)
>  has non-zero size but the last offset is 233 and the base offset is 233
>   at scala.Predef$.require(Predef.scala:145)
>   at kafka.log.OffsetIndex.sanityCheck(OffsetIndex.scala:352)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:159)
>   at kafka.log.Log$$anonfun$loadSegments$5.apply(Log.scala:158)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:631)
>   at 
>  scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:474)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:79)
>   at 
>  scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:495)
>   at kafka.log.Log.loadSegments(Log.scala:158)
>   at kafka.log.Log.<init>(Log.scala:64)
>   at 
>  kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:118)
>   at 
>  kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$4.apply(LogManager.scala:113)
>   at 
>  scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:113)
>   at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:105)
>   at 
>  scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
>   at kafka.log.LogManager.loadLogs(LogManager.scala:105)
>   at kafka.log.LogManager.<init>(LogManager.scala:57)
>   at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:275)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:72)
>   at 
>  kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>   at kafka.Kafka$.main(Kafka.scala:46)
>   at kafka.Kafka.main(Kafka.scala)
>  INFO [2014-11-21 02:57:17,534] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shutting down
>  INFO [2014-11-21 02:57:17,538] [main][] kafka.server.KafkaServer - [Kafka 
> Server 2], shut down completed
>  INFO [2014-11-21 02:57:17,539] [Thread-2][] kafka.server.KafkaServer - 
> [Kafka Server 2], shutting down



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)