date:20130717

partition key patch

2013-07-17 Thread Jay Kreps

Anyone able to take a look at this?

https://issues.apache.org/jira/browse/KAFKA-925

-Jay

[jira] [Updated] (KAFKA-925) Add optional partition key override in producer

2013-07-17 Thread Jay Kreps (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jay Kreps updated KAFKA-925:

Attachment: KAFKA-925-v2.patch

Updated patch--rebased to trunk.

Add optional partition key override in producer
---

Key: KAFKA-925
URL: https://issues.apache.org/jira/browse/KAFKA-925
Project: Kafka
Issue Type: New Feature
Components: producer
Affects Versions: 0.8.1
Reporter: Jay Kreps
Assignee: Jay Kreps
Attachments: KAFKA-925-v1.patch, KAFKA-925-v2.patch

We have a key that is used for partitioning in the producer and is also stored
with the message. These two uses, though often the same, can be different.
The two meanings are effectively:
1. Assignment to a partition
2. Deduplication within a partition
In cases where we want to allow the client to take advantage of both of these
and they aren't the same, it would be nice to allow them to be specified
separately.

To implement this I added an optional partition key to KeyedMessage. When
specified, this key is used for partitioning rather than the message key. This
key is of type Any, and the parametric typing is removed from the partitioner
to allow it to work with either key.

An alternative would be to allow the partition id to be specified in the
KeyedMessage. This would be slightly more convenient in the case where there
is no partition key but instead you know a priori the partition number. With
this patch, that case must be handled by giving the partition id as the
partition key and using an identity partitioner, which is slightly more
roundabout. However, the alternative is inconsistent with the normal
partitioning, which requires a key: in the case where the partition is
determined by a key, you would be manually calling your partitioner in user
code. It seems best to me to either always use a key or always use a
partition, and since we currently take a key I stuck with that.
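
A minimal sketch of the override described above, written in Java for
illustration. The names PartitionKeySketch, Message, Partitioner, and
HashPartitioner are hypothetical stand-ins and are not the classes touched by
the patch. It only shows the routing rule: partition by the optional partition
key when one is set, otherwise by the message key, with a partitioner that
takes an untyped key so it can handle both.

```java
/** Sketch only: these types are hypothetical stand-ins, not Kafka's classes. */
final class PartitionKeySketch {

    /** Partitioner without a type parameter, so it accepts either kind of key. */
    interface Partitioner {
        int partition(Object key, int numPartitions);
    }

    /** Maps any non-null key to a partition through its hash code. */
    static final class HashPartitioner implements Partitioner {
        @Override
        public int partition(Object key, int numPartitions) {
            // floorMod keeps the result in [0, numPartitions) for negative hashes.
            return Math.floorMod(key.hashCode(), numPartitions);
        }
    }

    /** A message with a stored key and an optional partition key. */
    static final class Message<K, V> {
        final String topic;
        final K key;          // stored with the message
        final Object partKey; // optional override, used only for routing; may be null
        final V value;

        Message(String topic, K key, Object partKey, V value) {
            this.topic = topic;
            this.key = key;
            this.partKey = partKey;
            this.value = value;
        }

        /** The key used for partitioning: the override if set, else the message key. */
        Object partitioningKey() {
            return partKey != null ? partKey : key;
        }
    }

    /** Chooses the partition for a message using whichever key applies. */
    static int partitionFor(Message<?, ?> msg, Partitioner partitioner, int numPartitions) {
        return partitioner.partition(msg.partitioningKey(), numPartitions);
    }

    public static void main(String[] args) {
        Partitioner partitioner = new HashPartitioner();
        Message<String, String> plain =
            new Message<>("events", "user-42", null, "payload");
        Message<String, String> routed =
            new Message<>("events", "user-42", "region-eu", "payload");
        System.out.println("by message key:   " + partitionFor(plain, partitioner, 8));
        System.out.println("by partition key: " + partitionFor(routed, partitioner, 8));
    }
}
```

Both messages carry the same stored key, so anything that relies on the key
within a partition is unaffected by which key was used for routing.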

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: partition key patch

2013-07-17 Thread Wang Guozhang

I will do it.

On Wed, Jul 17, 2013 at 2:17 PM, Jay Kreps jay.kr...@gmail.com wrote:

 Anyone able to take a look at this?

 https://issues.apache.org/jira/browse/KAFKA-925

 -Jay




-- 
-- Guozhang

[jira] [Commented] (KAFKA-925) Add optional partition key override in producer

2013-07-17 Thread Chris Riccomini (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711676#comment-13711676
]

Chris Riccomini commented on KAFKA-925:
---

Hey Jay,

Seems pretty reasonable to me. Is the reason for the type change in the
Partitioner so that you can handle either keys of type K (key) or keys of any
type (part key) using the same partitioner?

Cheers,
Chris


[jira] [Updated] (KAFKA-615) Avoid fsync on log segment roll

2013-07-17 Thread Jay Kreps (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jay Kreps updated KAFKA-615:

Attachment: KAFKA-615-v4.patch

Rebased patch to trunk.

Avoid fsync on log segment roll
---

Key: KAFKA-615
URL: https://issues.apache.org/jira/browse/KAFKA-615
Project: Kafka
Issue Type: Bug
Reporter: Jay Kreps
Assignee: Neha Narkhede
Attachments: KAFKA-615-v1.patch, KAFKA-615-v2.patch,
KAFKA-615-v3.patch, KAFKA-615-v4.patch

It still isn't feasible to run without an application-level fsync policy.
This is a problem because fsync locks the file, and tuning such a policy is
very challenging: the flushes must not be so frequent that seeks reduce
throughput, yet not so infrequent that each fsync writes so much data that
there is a noticeable jump in latency.
The remaining problem is the way that log recovery works. Our current policy
is that if a clean shutdown occurs we do no recovery. If an unclean shutdown
occurs we recover the last segment of all logs. To make this correct we need
to ensure that each segment is fsync'd before we create a new segment. Hence
the fsync during roll.
Obviously if the fsync during roll is the only time fsync occurs then it will
potentially write out the entire segment, which for a 1GB segment at 50MB/sec
might take many seconds. The goal of this JIRA is to eliminate this and make
it possible to run with no application-level fsyncs at all, depending
entirely on replication and background writeback for durability.
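
To put a rough number on "many seconds", here is a small back-of-the-envelope
calculation in Java. It assumes the rate in the description means megabytes
per second and that the whole segment is still unwritten when the roll
happens; the class and method names are made up for this illustration.

```java
/** Rough estimate of how long an fsync of a full segment could take. */
final class RollFlushEstimate {

    /** Seconds needed to write segmentBytes at a sustained bytesPerSecond. */
    static double flushSeconds(long segmentBytes, long bytesPerSecond) {
        return (double) segmentBytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        long segmentBytes = 1024L * 1024 * 1024; // 1 GB segment
        long bytesPerSecond = 50L * 1024 * 1024; // 50 MB/sec (assumed: megabytes)
        // 1024 MB / 50 MB per second is roughly 20 seconds.
        System.out.println(flushSeconds(segmentBytes, bytesPerSecond) + " seconds");
    }
}
```

Under these assumptions the worst case is on the order of 20 seconds for a
single roll.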

[jira] [Commented] (KAFKA-925) Add optional partition key override in producer

2013-07-17 Thread Guozhang Wang (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711690#comment-13711690
]

Guozhang Wang commented on KAFKA-925:
-

Hi Jay,

In the DefaultEventHandler, only the key is serialized and sent. The partition
key is used to determine the partition and then dropped, so consumers would
not be able to read this partition key. Will this be a problem for tools such
as MirrorMaker?

Guozhang


[jira] [Commented] (KAFKA-925) Add optional partition key override in producer

2013-07-17 Thread Jay Kreps (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711707#comment-13711707
]

Jay Kreps commented on KAFKA-925:
-

Yes, the idea of this feature is to make it possible to partition by something
other than the stored key.


[jira] [Commented] (KAFKA-925) Add optional partition key override in producer

2013-07-17 Thread Jay Kreps (JIRA)

[
https://issues.apache.org/jira/browse/KAFKA-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711709#comment-13711709
]

Jay Kreps commented on KAFKA-925:
-

It is definitely true that downstream consumers cannot use the same key, though
a generic tool can always just retain the partition by setting the partition
number as the partition key and using a partitioner which just uses that number.
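
The workaround in this comment could look roughly like the following Java
sketch. MirrorPartitionSketch, ConsumedRecord, and IdentityPartitioner are
hypothetical names for illustration only, and the sketch assumes the target
topic has at least as many partitions as the source topic.

```java
/** Sketch only: hypothetical types showing how a copy tool could retain partitions. */
final class MirrorPartitionSketch {

    /** Partitioner that accepts a key of any type. */
    interface Partitioner {
        int partition(Object key, int numPartitions);
    }

    /** Identity partitioner: the partition key is itself the partition number. */
    static final class IdentityPartitioner implements Partitioner {
        @Override
        public int partition(Object key, int numPartitions) {
            if (!(key instanceof Integer)) {
                throw new IllegalArgumentException("partition key must be an Integer");
            }
            int partition = (Integer) key;
            if (partition < 0 || partition >= numPartitions) {
                throw new IllegalArgumentException("no such partition: " + partition);
            }
            return partition;
        }
    }

    /** A record as read from the source cluster, with the partition it came from. */
    static final class ConsumedRecord {
        final int sourcePartition;
        final byte[] key;
        final byte[] value;

        ConsumedRecord(int sourcePartition, byte[] key, byte[] value) {
            this.sourcePartition = sourcePartition;
            this.key = key;
            this.value = value;
        }
    }

    /** Uses the source partition number as the partition key when re-publishing. */
    static int targetPartition(ConsumedRecord record, Partitioner partitioner,
                               int numPartitions) {
        return partitioner.partition(record.sourcePartition, numPartitions);
    }

    public static void main(String[] args) {
        ConsumedRecord record = new ConsumedRecord(5, new byte[0], new byte[0]);
        System.out.println(targetPartition(record, new IdentityPartitioner(), 8));
    }
}
```

The stored key is passed through untouched; only the routing decision uses
the source partition number.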


[jira] [Created] (KAFKA-979) Add jitter for time based rolling

2013-07-17 Thread Sriram Subramanian (JIRA)

Sriram Subramanian created KAFKA-979:


Summary: Add jitter for time based rolling
Key: KAFKA-979
URL: https://issues.apache.org/jira/browse/KAFKA-979
Project: Kafka
Issue Type: Bug
Reporter: Sriram Subramanian


Currently, for low-volume topics, time-based rolling happens at the same time.
This causes a lot of IO on a typical cluster and creates back pressure. We need
to add jitter to prevent the rolls from happening at the same time.
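
One way to read "jitter" here, sketched in Java: each log shortens its roll
interval by a random amount, so segments created at the same moment stop
rolling at the same moment. This is only an illustration of the idea, not the
proposed change; RollJitterSketch and its parameter names are hypothetical.

```java
import java.util.Random;

/** Sketch only: spreads time-based rolls by subtracting a random jitter. */
final class RollJitterSketch {

    /**
     * Returns the roll interval reduced by a random jitter in [0, bound),
     * where bound is maxJitterMs capped at rollMs so the result stays positive.
     */
    static long jitteredRollMs(long rollMs, long maxJitterMs, Random rng) {
        if (maxJitterMs <= 0) {
            return rollMs; // jitter disabled
        }
        long bound = Math.min(maxJitterMs, rollMs);
        long jitter = (long) (rng.nextDouble() * bound);
        return rollMs - jitter;
    }

    /** True once the segment has been open for at least its jittered interval. */
    static boolean shouldRoll(long segmentCreatedMs, long nowMs, long jitteredRollMs) {
        return nowMs - segmentCreatedMs >= jitteredRollMs;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        long rollMs = 60L * 60 * 1000;      // roll every hour
        long maxJitterMs = 10L * 60 * 1000; // up to ten minutes early
        for (int log = 0; log < 3; log++) {
            System.out.println("log " + log + " rolls after "
                + jitteredRollMs(rollMs, maxJitterMs, rng) + " ms");
        }
    }
}
```

Picking the jitter once per segment, rather than on every check, keeps each
segment's roll time stable while still spreading the IO across logs.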


[jira] [Commented] (KAFKA-979) Add jitter for time based rolling

2013-07-17 Thread Swapnil Ghike (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13712002#comment-13712002
 ] 

Swapnil Ghike commented on KAFKA-979:
-

Hey Sriram, can you explain what we are trying to achieve here? I am not sure 
if I understood the meaning of jitter completely.

 Add jitter for time based rolling
 -

 Key: KAFKA-979
 URL: https://issues.apache.org/jira/browse/KAFKA-979
 Project: Kafka
  Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian

 Currently, for low volume topics time based rolling happens at the same time. 
 This causes a lot of IO on a typical cluster and creates back pressure. We 
 need to add a jitter to prevent them from happening at the same time.

