[jira] [Comment Edited] (KAFKA-1955) Explore disk-based buffering in new Kafka Producer

2017-06-08 Thread Jay Kreps (JIRA)

[ https://issues.apache.org/jira/browse/KAFKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043494#comment-16043494 ]

Jay Kreps edited comment on KAFKA-1955 at 6/8/17 9:43 PM:
--

I think the patch I submitted was kind of a cool hack, but after thinking about 
it I wasn't convinced it was really what you want.

Here are the considerations I thought we should probably think through:
1. How will recovery work? The patch I gave didn't have a mechanism to recover 
from a failure, and I don't think that is good enough: it means it is okay if 
Kafka goes down, or if the app goes down, but not both. That helps, but it isn't 
really what you want. Properly handling app failure isn't easy, though. In the 
case of an OS crash, the OS gives very weak guarantees about what is on disk for 
any data that hasn't been fsync'd. Not only can arbitrary bits of data be 
missing, but with some filesystem configurations it is even possible to get 
arbitrary corrupt blocks that haven't been zeroed yet. I think to get this right 
you need a commit log and a recovery procedure that verifies unsync'd data on 
startup. I'm not 100% sure you can do this with just the buffer pool, though 
maybe you can.
2. What are the ordering guarantees for buffered data?
3. How does this interact with transactions/EOS?
4. Should all writes go through the commit log, or should only failures be 
journaled? If you journal every write prior to sending it to the server, that 
amounts to significant overhead, and logging or other I/O can slow you down. If 
you journal only failures, your throughput may be very high in the non-failure 
case, but when Kafka goes down you suddenly start doing I/O that is much slower, 
and your throughput drops precipitously. Either may be okay, but it is worth 
thinking through what the right behavior is.
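To make the recovery concern in point 1 concrete, here is a minimal sketch (not anything in Kafka; all names are hypothetical) of a journal whose records are framed with a length and CRC, plus a startup scan that keeps only records that verify and truncates the unverifiable tail left by a crash:

```python
import struct
import zlib

# Each record is framed as [payload length][CRC32 of payload][payload],
# so a recovery scan can tell complete, intact records from torn or
# corrupt tail data left behind by a crash before fsync.
HEADER = struct.Struct(">II")

def append_record(f, payload: bytes):
    """Append one framed record to the journal file."""
    f.write(HEADER.pack(len(payload), zlib.crc32(payload)))
    f.write(payload)

def recover(path):
    """Scan the journal on startup, return the verified records, and
    truncate the file at the first torn or corrupt record."""
    records, valid_end = [], 0
    with open(path, "r+b") as f:
        data = f.read()
        pos = 0
        while pos + HEADER.size <= len(data):
            length, crc = HEADER.unpack_from(data, pos)
            end = pos + HEADER.size + length
            if end > len(data):
                break  # torn write: record runs past end of file
            payload = data[pos + HEADER.size:end]
            if zlib.crc32(payload) != crc:
                break  # corrupt block: stop at the first bad record
            records.append(payload)
            valid_end = end
            pos = end
        f.truncate(valid_end)  # drop the unverifiable tail
    return records
```

This is only the shape of the idea; a real implementation would also need fsync policy, rotation, and an answer to the ordering and EOS questions in points 2 and 3.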
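The journal-only-failures option in point 4 can be sketched library-agnostically (this is not the Kafka producer API; `send` and `journal_append` are hypothetical stand-ins for the network send and the disk journal):

```python
class SpillOnFailureBuffer:
    """Journal-only-failures strategy: records are sent directly, and only
    a failed send is spilled to the disk journal. The fast path does no
    disk I/O, which is exactly why throughput drops sharply once the
    broker is down and every record takes the slow path."""

    def __init__(self, send, journal_append):
        self.send = send                      # stand-in for the network send
        self.journal_append = journal_append  # stand-in for a disk journal
        self.spilled = 0

    def produce(self, record: bytes):
        try:
            self.send(record)                 # fast path: no disk I/O
        except ConnectionError:
            self.journal_append(record)       # slow path: journal the failure
            self.spilled += 1
```

The journal-all-writes alternative would call `journal_append` before every `send`, trading steady overhead for a flat throughput profile across failures.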



> Explore disk-based buffering in new Kafka Producer
> --
>
> Key: KAFKA-1955
> URL: https://issues.apache.org/jira/browse/KAFKA-1955
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jay Kreps
> Attachments: KAFKA-1955.patch, 
> KAFKA-1955-RABASED-TO-8th-AUG-2015.patch
>
>
> There are two approaches to using Kafka for capturing event data that has no 
> other "source of truth store":
> 1. Just write to Kafka and try hard to keep the Kafka cluster up as you would 
> a database.
> 2. Write to some kind of local disk store and copy from that to Kafka.
> The cons of the second approach are the following:
> 1. You end up depending on disks on all the producer machines. If you have 
> 10k producers, that is 10k places state is kept. These tend to fail a lot.
> 2. You can get data arbitrarily delayed
> 3. You still don't tolerate hard outages since there is no replication in the 
> producer tier
> 4. This tends to make problems with duplicates more common in certain failure 
> scenarios.
> There is one big pro, though: you don't have to keep Kafka running all the 
> time.
> So far we have done nothing in Kafka to help support approach (2), but people 
> have built a lot of buffering things. It's not clear that this is necessarily 
> bad.
> However implementing this in the new Kafka producer might actually be quite 
> easy. Here is an idea for how to do it. Implementation of this idea is 