Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-03-08 Thread Dong Lin
exposing stats like the individual log size as JMX. So, one way is to just > add new jmx to expose the log directory of individual replicas. > > Thanks, > > Jun > > > On Thu, Mar 2, 2017 at 11:18 PM, Dong Lin wrote: > > > Hey Jun, > > > > Thanks

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-03-07 Thread Dong Lin
2. In the notification znode we have Event field as an integer. Can we > document what is the value of LogDirFailure? And also are there any other > possible values? > > Thanks, > > Jiangjie (Becket) Qin > > On Tue, Mar 7, 2017 at 11:30 AM, Dong Lin wrote: > > > Hey

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-03-07 Thread Dong Lin
licas on the good log directories". > > 5. In the protocol definition, we have isNewReplica, but it should probably > be is_new_replica. > Good point. My bad. It is fixed now. > > Thanks, > Ismael > > > On Thu, Jan 12, 2017 at 6:46 PM, Dong Lin wrote: > >

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-07 Thread Dong Lin
> > 4. When we move the purgeBefore() api to the Java AdminClient, it would be > great if the api looks comparable to what's in KIP-117. For now, perhaps we > can mark the api in Scala as unstable so that people are aware that it's > subject to change? > > Thanks, >

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-03-06 Thread Dong Lin
>> > >> Currently it is a count instead of rate. In practice, it seems count is > >> easier to use in this case. But I am open to change. > >> > >> Thanks, > >> > >> Jiangjie (Becket) Qin > >> > >> On Fri, Mar 3, 2017 at

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-03-05 Thread Dong Lin
g case for JBOD, > perhaps we should discuss KIP-113 before voting for both? I left some > comments in the other thread. > > Thanks, > > Jun > > On Wed, Mar 1, 2017 at 1:58 PM, Dong Lin wrote: > > > Hey Jun, > > > > Do you think it is OK to keep the existing wir

[jira] [Updated] (KAFKA-4841) NetworkClient should only consider a connection to be fail after attempt to connect

2017-03-05 Thread Dong Lin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-4841: Description: KAFKA-4820 allows new request to be enqueued to unsent by user thread while some other thread

[jira] [Updated] (KAFKA-4841) NetworkClient should only consider a connection to be fail after attempt to connect

2017-03-05 Thread Dong Lin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-4841: Description: KAFKA-4820 allows new request to be enqueued to unsent by user thread while some other thread

[jira] [Updated] (KAFKA-4841) NetworkClient should only consider a connection to be fail after attempt to connect

2017-03-05 Thread Dong Lin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-4841: Summary: NetworkClient should only consider a connection to be fail after attempt to connect (was

[jira] [Created] (KAFKA-4841) NetworkClient should only consider a connection to be fail after attempt to connct

2017-03-05 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4841: --- Summary: NetworkClient should only consider a connection to be fail after attempt to connct Key: KAFKA-4841 URL: https://issues.apache.org/jira/browse/KAFKA-4841 Project

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-03-03 Thread Dong Lin
Hey Becket, I haven't looked at the patch yet. But since we are going to try the split-on-oversize solution, should the KIP also add a sensor that shows the rate of split per second and the probability of split? Thanks, Dong On Fri, Mar 3, 2017 at 6:39 PM, Becket Qin wrote: > Just to clarify,

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-03 Thread Dong Lin
consider > > if `Before` should be in the method name or should be in the parameter > > class. Just an example to describe what I mean, one could say > > `deleteRecords(DeleteRecordsParams.before(offsetsForPartition)`. That > way, > > we could provide a different way of

Re: [VOTE] KIP-119: Drop Support for Scala 2.10 in Kafka 0.11

2017-03-03 Thread Dong Lin
+1 (non-binding) On Thu, Mar 2, 2017 at 11:18 AM, Becket Qin wrote: > Thanks for the clarification, Ismael. In that case, it is reasonable to > drop support for Scala 2.10. LinkedIn is probably fine with this change. > > I did not notice we have recommended Scala version on the download page. >

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-03 Thread Dong Lin
>> purgeDataBefore(Map > offsetForPartition)`. In the AdminClient KIP (KIP-117), we are using > classes to encapsulate the parameters and result. We should probably do the > same in this KIP for consistency. Once we do that, we should also consider > if `Before` should be in the meth

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-03-02 Thread Dong Lin
ent on ChangeReplicaDirRequest vs > ChangeReplicaRequest. > I think ChangeReplicaRequest and ChangeReplicaResponse are my typos. Sorry, they are fixed now. > > Thanks, > > Jun > > > On Fri, Feb 3, 2017 at 6:19 PM, Dong Lin wrote: > > > Hey Alexey, > > >

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-02 Thread Dong Lin
: > Hi, Dong, > > It seems that delete means removing everything while purge means removing a > portion. So, it seems that it's better to be able to distinguish the two? > > Thanks, > > Jun > > On Wed, Mar 1, 2017 at 1:57 PM, Dong Lin wrote: > > > Hi al

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-03-01 Thread Dong Lin
Hey Jun, Do you think it is OK to keep the existing wire protocol in the KIP? I am wondering if we can initiate vote for this KIP. Thanks, Dong On Tue, Feb 28, 2017 at 2:41 PM, Dong Lin wrote: > Hey Jun, > > I just realized that StopReplicaRequest itself doesn't specify the

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-01 Thread Dong Lin
an option. Personally I don't have a strong preference between "purge" and "delete". I am wondering if anyone objects to this change. Thanks, Dong On Wed, Mar 1, 2017 at 9:46 AM, Dong Lin wrote: > Hi Ismael, > > I actually mean log_start_offset. I realized that it is

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-03-01 Thread Dong Lin
> only find the latter in the KIP. If so, would log_start_offset be a better > name? > > Ismael > > On Tue, Feb 28, 2017 at 4:26 AM, Dong Lin wrote: > > > Hi Jun and everyone, > > > > I would like to change the KIP in the following way. Currentl

[jira] [Updated] (KAFKA-4820) ConsumerNetworkClient.send() should not require global lock

2017-02-28 Thread Dong Lin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-4820: Description: Currently `ConsumerNetworkClient.send()` needs to acquire global lock of

[jira] [Created] (KAFKA-4820) ConsumerNetworkClient.send() should not require global lock

2017-02-28 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4820: --- Summary: ConsumerNetworkClient.send() should not require global lock Key: KAFKA-4820 URL: https://issues.apache.org/jira/browse/KAFKA-4820 Project: Kafka Issue Type

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-28 Thread Dong Lin
y the isNewReplica for the broker that receives LeaderAndIsrRequest. Thanks, Dong On Tue, Feb 28, 2017 at 2:14 PM, Dong Lin wrote: > Hi Jun, > > Yeah there is tradeoff between controller's implementation complexity vs. > wire-protocol complexity. I personally think it is more im

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-28 Thread Dong Lin
Thanks for the feedback. That's very useful. > > Jun > > On Tue, Feb 28, 2017 at 10:25 AM, Dong Lin wrote: > > > Hey Jun, > > > > Certainly, I have added Todd to reply to the thread. And I have updated > the > > item to in the wiki. > > >

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-02-28 Thread Dong Lin
Thanks Jun. I have updated the KIP to reflect this change. On Tue, Feb 28, 2017 at 9:44 AM, Jun Rao wrote: > Hi, Dong, > > Yes, this change makes sense to me. > > Thanks, > > Jun > > On Mon, Feb 27, 2017 at 8:26 PM, Dong Lin wrote: > > > Hi Jun and everyone

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-28 Thread Dong Lin
to start a > separate discussion thread on KIP-113? I do have some comments there. > > Thanks for working on this! > > Jun > > > On Mon, Feb 27, 2017 at 5:51 PM, Dong Lin wrote: > > > Hi Jun, > > > > In addition to the Eno's reference of why rebuild

Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-02-27 Thread Dong Lin
can allow purge operation to succeed when some replica is offline. Are you OK with this change? If so, I will go ahead to update the KIP and implement this behavior. Thanks, Dong On Tue, Jan 17, 2017 at 10:18 AM, Dong Lin wrote: > Hey Jun, > > Do you have time to review the KIP agai

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-27 Thread Dong Lin
ader will shrink ISR, expand it and shrink it > again after the timeout. > > The KIP seems to still reference " > /broker/topics/[topic]/partitions/[partitionId]/controller_managed_state". > > Thanks, > > Jun > > On Sat, Feb 25, 2017 at 7:49 PM, Dong Lin

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-25 Thread Dong Lin
n see, we will be adding some complexity to support > JBOD in Kafka one way or another. If we can tune the performance of RAID5 > to match that of RAID10, perhaps using RAID5 is a simpler solution. > > Thanks, > > Jun > > > On Fri, Feb 24, 2017 at 10:17 AM, Dong Lin wro

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-24 Thread Dong Lin
Hey Jun, I don't think we should allow failed replicas to be re-created on the good disks. Say there are 2 disks and each of them is 51% loaded. If either disk fails, and we allow replicas to be re-created on the other disk, both disks will fail. Alternatively we can disable replica creation if there
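The capacity argument in the message above can be checked with a little arithmetic. The following is only a sketch under stated assumptions: the helper name `load_after_failover` is hypothetical, loads are expressed as fractions of each disk's capacity, and the failed disk's data is assumed to be spread evenly over the surviving disks. It is not how Kafka places replicas.

```python
def load_after_failover(loads, failed):
    """Return {disk_index: load fraction} for the surviving disks after the
    failed disk's data is re-created evenly across them (an assumption)."""
    survivors = [i for i in range(len(loads)) if i != failed]
    if not survivors:
        raise ValueError("no surviving disk to re-create replicas on")
    extra = loads[failed] / len(survivors)
    return {i: loads[i] + extra for i in survivors}


# Two disks, each 51% full, as in the example from the message.
after = load_after_failover([0.51, 0.51], failed=0)

# Any disk pushed past 100% cannot hold the re-created replicas.
overfull = [disk for disk, load in after.items() if load > 1.0]
print(after, overfull)
```

With two disks at 51% each, the survivor would need roughly 102% of its capacity, which is the cascading failure the message warns about.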

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-23 Thread Dong Lin
sks. But controller still needs to learn about offline replicas from LeaderAndIsrResponse. I think this is better than the current design. Do you have any concern with this design? Thanks, Dong On Thu, Feb 23, 2017 at 7:12 PM, Dong Lin wrote: > Hey Jun, > > Sure, here is my explanation.

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-23 Thread Dong Lin
a pretty minor issue with design B. Thanks, Dong On Thu, Feb 23, 2017 at 6:46 PM, Jun Rao wrote: > Hi, Dong, > > My replies are inlined below. > > On Thu, Feb 23, 2017 at 4:47 PM, Dong Lin wrote: > > > Hey Jun, > > > > Thanks for your reply! Let me fir

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-23 Thread Dong Lin
nually across disks? > Good point. Well, if all log directories are available, the failed log > directory path will be cleared. In the rarer case that a log directory is > still offline and one of the replicas registered in the failed log > directory shows up in another available log d

Re: [DISCUSS] KIP-125: ZookeeperConsumerConnector to KafkaConsumer Migration and Rollback

2017-02-23 Thread Dong Lin
> > "dual.commit.enabled" set to true as well as "offsets.storage" set to > > kafka. The combination of these configs results in the consumer fetching > > offsets from both kafka and zookeeper and just picking the greater of the > > two. > > > > On Mon
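The rule quoted in the message above (fetch the committed offset from both Kafka and ZooKeeper, then take the greater) can be sketched as follows. The function name `resolve_committed_offset` is hypothetical, and treating a missing offset as `None` is an assumption made for illustration; this is not the consumer's actual code.

```python
def resolve_committed_offset(kafka_offset, zk_offset):
    """Pick the greater of the two committed offsets.

    Either argument may be None when that store has no committed offset
    (an assumption for this sketch); None is returned if both are missing.
    """
    candidates = [o for o in (kafka_offset, zk_offset) if o is not None]
    return max(candidates) if candidates else None


print(resolve_committed_offset(120, 95))
print(resolve_committed_offset(None, 95))
```

Taking the maximum means a consumer resumes from whichever store saw the most recent commit, regardless of which store that was.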

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-23 Thread Dong Lin
an important change depending on the answer to 1) above. We probably need to document this more explicitly. Dong On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin wrote: > Hey Jun, > > Yeah you are right. I thought it wasn't because at LinkedIn it will be too > much pressure on

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-23 Thread Dong Lin
+1 (non-binding) On Wed, Feb 22, 2017 at 10:52 PM, Manikumar wrote: > +1 (non-binding) > > On Thu, Feb 23, 2017 at 3:27 AM, Mayuresh Gharat < > gharatmayures...@gmail.com > > wrote: > > > Hi Jun, > > > > Thanks a lot for the comments and reviews. > > I agree we should log the username. > > What

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-23 Thread Dong Lin
be no reuse of existing metrics/sensors. The new ones > for > > > > request processing time based throttling will be completely > independent > > > of > > > > existing metrics/sensors, but will be consistent in format. > > > > > > > > The

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-22 Thread Dong Lin
Hey Rajini, I think it makes a lot of sense to use io_thread_units as metric to quota user's traffic here. LGTM overall. I have some questions regarding sensors. - Can you be more specific in the KIP what sensors will be added? For example, it will be useful to specify the name and attributes of

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-02-22 Thread Dong Lin
sure that will be the case. This is at least a concern where MM is mirroring traffic for only a few partitions of high byte-in rate. Thus I am wondering if we should do the optimization proposed above. Thanks, Dong On Wed, Feb 22, 2017 at 6:39 PM, Dong Lin wrote: > Hey Becket, > > Thank

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-02-22 Thread Dong Lin
Hey Becket, Thanks for the KIP. I have one question here. Suppose producer's batch.size=100 KB, max.in.flight.requests.per.connection=1. Since each ProduceRequest contains one batch per partition, it means that 100 KB compressed data will be produced per partition per round-trip time as of curren
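The bound described in the message above can be worked through numerically. This is a back-of-the-envelope sketch: the function name `max_partition_throughput` is hypothetical and the 50 ms round-trip time is an assumed value, not a figure from the thread. It takes the message's premise that each ProduceRequest carries one batch per partition.

```python
def max_partition_throughput(batch_size_bytes, rtt_ms, max_in_flight=1):
    """Upper bound on bytes per second sent to one partition when each
    request carries at most one batch for that partition."""
    return batch_size_bytes * max_in_flight * 1000 / rtt_ms


# batch.size = 100 KB and max.in.flight.requests.per.connection = 1,
# as in the message; the 50 ms round trip is an assumption.
bound = max_partition_throughput(100 * 1024, rtt_ms=50)
print(bound)
```

Under these assumptions the producer cannot push more than about 2 MB/s of compressed data to a single partition, however fast the network is, because each round trip delivers at most one batch.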

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-22 Thread Dong Lin
caRequest to offline > replica is to handle controlled shutdown. In that case, a broker is still > alive, but indicates to the controller that it plans to shut down. Being > able to stop the replica in the shutting down broker reduces churns in ISR. > So, for simplicity, it's prob

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-21 Thread Dong Lin
Hey Jun, Motivated by your suggestion, I think we can also store the information of created replicas in per-broker znode at /brokers/created_replicas/ids/[id]. Does this sound good? Regards, Dong On Tue, Feb 21, 2017 at 2:37 PM, Dong Lin wrote: > Hey Jun, > > Thanks much for your

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-21 Thread Dong Lin
his during broker startup, it can probably > just log an error and exit. The admin can remove the redundant partitions > manually and then restart the broker. > > Thanks, > > Jun > > On Sat, Feb 18, 2017 at 9:31 PM, Dong Lin wrote: > > > Hey Jun, > >

Re: [DISCUSS] KIP-125: ZookeeperConsumerConnector to KafkaConsumer Migration and Rollback

2017-02-20 Thread Dong Lin
Hey Onur, Thanks for the well-written KIP! I have two questions below. 1) In the process of migrating from OZKCCs and MDZKCCs to MEZKCCs, we may have a mix of OZKCCs, MDZKCCs and MEZKCCs. OZKCC and MDZKCC will only commit to zookeeper and MDZKCC will use kafka-based offset storage. Would we lose

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-20 Thread Dong Lin
a, while it seems easy to enforce and > intuitive, > > there are some caveats. > > 1. Users do not have direct control over the request rate, i.e. users do > > not know when a request will be sent by the clients. > > 2. Each request may require different amount of CPU reso

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-18 Thread Dong Lin
information in the KIP on > where this window would be configured. > > On Fri, Feb 17, 2017 at 9:45 AM, Dong Lin wrote: > > > To correct the typo above: It seems to me that determination of request > > rate is not any more difficult than determination of *byte* rate as

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-18 Thread Dong Lin
Hey Jun, Could you please let me know if the solutions above could address your concern? I really want to move the discussion forward. Thanks, Dong On Tue, Feb 14, 2017 at 8:17 PM, Dong Lin wrote: > Hey Jun, > > Thanks for all your help and time to discuss this KIP. When you get t

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-17 Thread Dong Lin
To correct the typo above: It seems to me that determination of request rate is not any more difficult than determination of *byte* rate as both metrics are commonly used to measure performance and provide guarantee to user. On Fri, Feb 17, 2017 at 9:40 AM, Dong Lin wrote: > Hey Raj

Re: [DISCUSS] KIP-124: Request rate quotas

2017-02-17 Thread Dong Lin
Hey Rajini, Thanks for the KIP. I have some questions: - I am wondering why throttling based on request rate is listed as a rejected alternative. Can you provide more specific reason why it is difficult for administrators to decide request rates to allocate? It seems to me that determination of r

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-16 Thread Dong Lin
Hey Colin, Thanks for the update. I have two comments: - I actually think it is simpler and good enough to have per-topic API instead of batch-of-topic API. This is different from the argument for batch-of-partition API because, unlike operation on topic, people usually operate on multiple partit

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-14 Thread Dong Lin
ZK read time should be less than 10% of the existing total zk read time during controller failover. Thanks! Dong On Tue, Feb 14, 2017 at 7:30 AM, Dong Lin wrote: > Hey Jun, > > I just realized that you may be suggesting that a tool for listing offline > directories is necessa

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-14 Thread Dong Lin
clude this script in KIP-113. Regardless, my hope is to finish both KIPs ASAP and make them in the same release since both KIPs are needed for the JBOD setup. Thanks, Dong On Mon, Feb 13, 2017 at 5:52 PM, Dong Lin wrote: > And the test plan has also been updated to simulate disk failure by >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-13 Thread Dong Lin
And the test plan has also been updated to simulate disk failure by changing log directory permission to 000. On Mon, Feb 13, 2017 at 5:50 PM, Dong Lin wrote: > Hi Jun, > > Thanks for the reply. These comments are very helpful. Let me answer them > inline. > > > On Mon, Fe

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-13 Thread Dong Lin
Hi Jun, Thanks for the reply. These comments are very helpful. Let me answer them inline. On Mon, Feb 13, 2017 at 3:25 PM, Jun Rao wrote: > Hi, Dong, > > Thanks for the reply. A few more replies and new comments below. > > On Fri, Feb 10, 2017 at 4:27 PM, Dong Lin wrot

[jira] [Created] (KAFKA-4763) Handle disk failure for JBOD (KIP-112)

2017-02-13 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4763: --- Summary: Handle disk failure for JBOD (KIP-112) Key: KAFKA-4763 URL: https://issues.apache.org/jira/browse/KAFKA-4763 Project: Kafka Issue Type: Improvement

Re: [VOTE] KIP-48 Support for delegation tokens as an authentication mechanism

2017-02-13 Thread Dong Lin
+1 (non-binding) On Mon, Feb 13, 2017 at 10:21 AM, Harsha Chintalapani wrote: > +1. > -Harsha > > On Fri, Feb 10, 2017 at 11:12 PM Manikumar > wrote: > > > Yes, owners and the renewers can always describe their own tokens. > Updated > > the KIP. > > > > On Sat, Feb 11, 2017 at 3:12 AM, Jun Rao

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-11 Thread Dong Lin
. offsetsForTimes) which are typically used for operation on multiple partitions at a time. On Fri, Feb 10, 2017 at 5:05 PM, Dong Lin wrote: > Hi Jun, > > Currently KIP-107 uses this API: > > Future> > purgeDataBefore(Map Long> offsetForPartition) > > Are you suggesti

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-10 Thread Dong Lin
> > Thanks, > > Jun > > On Thu, Feb 9, 2017 at 10:54 AM, Dong Lin wrote: > > > Thanks for the explanation. This makes sense. > > > > Best, > > Dong > > > > On Thu, Feb 9, 2017 at 10:51 AM, Colin McCabe > wrote: > > > > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-10 Thread Dong Lin
nk Kafka may be easier to develop in the long term if we separate these two requests. I agree that ideally we want to create replicas in the right log directory in the first place. But I am not sure if there is any performance or correctness concern with the existing way of moving it after it is crea

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-10 Thread Dong Lin
comments, Dong On Thu, Feb 9, 2017 at 4:45 PM, Dong Lin wrote: > > > On Thu, Feb 9, 2017 at 3:37 PM, Colin McCabe wrote: > >> On Thu, Feb 9, 2017, at 11:40, Dong Lin wrote: >> > Thanks for all the comments Colin! >> > >> > To answer your questions: >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-09 Thread Dong Lin
On Thu, Feb 9, 2017 at 3:37 PM, Colin McCabe wrote: > On Thu, Feb 9, 2017, at 11:40, Dong Lin wrote: > > Thanks for all the comments Colin! > > > > To answer your questions: > > - Yes, a broker will shutdown if all its log directories are bad. > > That make

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-09 Thread Dong Lin
the > index and some other files. Clearly, log dirs that are completely > inaccessible will still be considered bad after a broker process bounce. > > best, > Colin > > > > > +1 (non-binding) aside from that > > > > > > > > On Wed, Feb 8, 2017, a

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-09 Thread Dong Lin
Thanks for the explanation. This makes sense. Best, Dong On Thu, Feb 9, 2017 at 10:51 AM, Colin McCabe wrote: > On Wed, Feb 8, 2017, at 19:02, Dong Lin wrote: > > I am not aware of any semantics that will be caused by sharing > > NetworkClient between producer/consumer and Adm

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-08 Thread Dong Lin
ent. Also, the > NetworkClient is an internal class which is not really meant for users. Do > we really want to open that up? Is the only benefit saving the number of > connections? Seems not worth it in my opinion. > > -Jason > > On Wed, Feb 8, 2017 at 6:43 PM, Dong Lin wrote: > >

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-08 Thread Dong Lin
BTW, the idea to share NetworkClient is suggested by Radai and I like this idea. On Wed, Feb 8, 2017 at 6:39 PM, Dong Lin wrote: > Hey Colin, > > Thanks for updating the KIP. I have two followup questions: > > - It seems that setCreationConfig(...) is a bit redundant given that m

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-08 Thread Dong Lin
Hey Colin, Thanks for updating the KIP. I have two followup questions: - It seems that setCreationConfig(...) is a bit redundant given that most arguments (e.g. topic name, partition num) are already passed to TopicsContext.create(...) when user creates topic. Should we pass the creationConfig as

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-08 Thread Dong Lin
bad so that it will not be used. Thanks, Dong On Tue, Feb 7, 2017 at 5:23 PM, Dong Lin wrote: > Hey Eno, > > Thanks much for the comment! > > I still think the complexity added to Kafka is justified by its benefit. > Let me provide my reasons below. > > 1) The additiona

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-07 Thread Dong Lin
Hey Jorge, Thanks for the KIP. I have some quick comments: - Should we allow user to use wildcard to reset offset of all groups for a given topic as well? - Should we allow user to specify timestamp per topic partition in the json file as well? - Should the script take some credential file to mak

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-07 Thread Dong Lin
complexity added to Kafka is justified. > Operationally it seems to me an admin will still have to do all the three > items above. > > Looking forward to the discussion > Thanks > Eno > > > > On 1 Feb 2017, at 17:21, Dong Lin wrote: > > > > Hey Eno, > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-07 Thread Dong Lin
> below. > > On Mon, Feb 6, 2017 at 7:22 PM, Dong Lin wrote: > > > Hey Jun, > > > > Thanks for the review! Please see reply inline. > > > > On Mon, Feb 6, 2017 at 6:21 PM, Jun Rao wrote: > > > > > Hi, Dong, > > > > > > Thanks for

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-07 Thread Dong Lin
can pool a few > > disks together to create a volume/directory and give that to Kafka. > > > > > > The kernel of my question will be that the admin already has tools to 1) > > create volumes/directories from a JBOD and 2) start a broker on a desired > > machine and

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-06 Thread Dong Lin
But I am not very sure what we gain by using RPC instead of ZK. Should we have a separate KIP in the future to migrate all existing notification to using RPC? > > Jun > > > On Wed, Jan 25, 2017 at 1:50 PM, Dong Lin wrote: > > > Hey Colin, > > > > Good point! Yeah

[jira] [Created] (KAFKA-4735) Fix deadlock issue during MM shutdown

2017-02-05 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4735: --- Summary: Fix deadlock issue during MM shutdown Key: KAFKA-4735 URL: https://issues.apache.org/jira/browse/KAFKA-4735 Project: Kafka Issue Type: Bug

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-02-03 Thread Dong Lin
ki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=67638408&selectedPageVersions=5&selectedPageVersions=6>. The idea is to use the same replication quota mechanism introduced in KIP-73. Thanks, Dong On Wed, Feb 1, 2017 at 2:16 AM, Alexey Ozeritsky wrote: > > > 24.01.

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-03 Thread Dong Lin
Feb 2, 2017, at 17:54, Dong Lin wrote: > > Hey Colin, > > > > Thanks for the KIP. I have a few comments below: > > > > - I share similar view with Ismael that a Future-based API is better. > > PurgeDataFrom() is an example API that user may want to do it > >

Re: [DISCUSS] KIP-117: Add a public AdministrativeClient API for Kafka admin operations

2017-02-02 Thread Dong Lin
Hey Colin, Thanks for the KIP. I have a few comments below: - I share similar view with Ismael that a Future-based API is better. PurgeDataFrom() is an example API that user may want to do it asynchronously even though there is only one request in flight at a time. In the future we may also have

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-02 Thread Dong Lin
Hey Eno, I forgot that. Sure, that works for us. Thanks, Dong On Thu, Feb 2, 2017 at 2:03 AM, Eno Thereska wrote: > Hi Dong, > > The KIP meetings are traditionally held at 11am. Would that also work? So > Tuesday 7th at 11am? > > Thanks > Eno > > > On 2 Feb 2

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-01 Thread Dong Lin
erface > >> that needs to be implemented by custom principal. > >> -> Doing this might be backwards incompatible as we need to > >> preserve the existing behavior of kafka-acls.sh. Also as we have field > of > >> PrincipalType which can be used in fut

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
Sorry for the typo. I mean that before the KIP meeting, please feel free to provide comment in this email thread so that discussion in the KIP meeting can be more efficient. On Wed, Feb 1, 2017 at 6:53 PM, Dong Lin wrote: > Hey Eno, Colin, > > Would you have time next Tuesday morning t

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
concern the KIP after the KIP meeting. In the meeting time, please feel free to provide comment in the thread so that discussion in the KIP meeting can be more efficient. Thanks, Dong On Wed, Feb 1, 2017 at 5:43 PM, Dong Lin wrote: > Hey Colin, > > Thanks much for the comment. Please see

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
Hey Colin, Thanks much for the comment. Please see my reply inline. On Wed, Feb 1, 2017 at 1:54 PM, Colin McCabe wrote: > On Wed, Feb 1, 2017, at 11:35, Dong Lin wrote: > > Hey Grant, Colin, > > > > My bad, I misunderstood Grant's suggestion initially. Indeed this is

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
isk errors. Formerly it was OK to just crash on a > > disk error; now it is not. It would be nice to see more in the test > > plan about injecting IOExceptions into disk handling code and verifying > > that we can handle it correctly. > > > > regards, > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
eep > their data safe and the cluster operators would see a shrink in many ISRs > and hopefully an obvious log message leading to a quick fix. I haven't > thought through this idea in depth though. So there could be some > shortfalls. > > Thanks, > Grant > >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-02-01 Thread Dong Lin
> saying (as I understand it). > > Disks are not the only resource on a machine, there are several instances > where multiple NICs are used for example. Do we want fine grained > management of all these resources? I'd argue that opens up the system to a > lot of complexity. >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-01-31 Thread Dong Lin
with more > running instances. > > is anyone running kafka with anywhere near 100GB heaps? i thought the point > was to rely on kernel page cache to do the disk buffering > > On Thu, Jan 26, 2017 at 11:00 AM, Dong Lin wrote: > > > Hey Colin, > > >

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-31 Thread Dong Lin
This thread was closed on Jan 18. We had more discussion after Guozhang's feedback on Jan 21. But no major change was made to the KIP after the discussion. On Tue, Jan 31, 2017 at 5:47 PM, Dong Lin wrote: > Hey Apurva, > > I think the KIP table in https://cwiki.apache.org

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-31 Thread Dong Lin
purgeDataFrom(). We can keep it that way. > > > > Thanks, > > > > Jun > > > > On Mon, Jan 23, 2017 at 4:24 PM, Dong Lin wrote: > > > > > Hi all, > > > > > > When I am implementing the patch, I realized that the current usage of >

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-01-26 Thread Dong Lin
Hey Colin, Thanks much for the comment. Please see my comment inline. On Thu, Jan 26, 2017 at 10:15 AM, Colin McCabe wrote: > On Wed, Jan 25, 2017, at 13:50, Dong Lin wrote: > > Hey Colin, > > > > Good point! Yeah we have actually considered and tested this solution,

Re: [VOTE] KIP-115: Enforce offsets.topic.replication.factor

2017-01-25 Thread Dong Lin
+1 On Wed, Jan 25, 2017 at 4:37 PM, Ismael Juma wrote: > +1 (binding) > > Ismael > > On Thu, Jan 26, 2017 at 12:34 AM, Onur Karaman < > onurkaraman.apa...@gmail.com > > wrote: > > > I'd like to start the vote for KIP-115: Enforce > > offsets.topic.replication.factor > > > > https://cwiki.apache.

Re: [DISCUSS] KIP-115: Enforce offsets.topic.replication.factor

2017-01-25 Thread Dong Lin
+1 On Wed, Jan 25, 2017 at 4:22 PM, Ismael Juma wrote: > An important question is if this needs to wait for a major release or not. > > Ismael > > On Thu, Jan 26, 2017 at 12:19 AM, Ismael Juma wrote: > > > +1 from me too. > > > > Ismael > > > > On Thu, Jan 26, 2017 at 12:07 AM, Ewen Cheslack-Po

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-01-25 Thread Dong Lin
lternate designs" design, even if you end > up deciding it's not the way to go. > > best, > Colin > > > On Thu, Jan 12, 2017, at 10:46, Dong Lin wrote: > > Hi all, > > > > We created KIP-112: Handle disk failure for JBOD. Please find the KIP > > wiki

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-01-24 Thread Dong Lin
Hey Mayuresh, Thanks for the KIP. I actually like the suggestions by Ismael and Jun. Here are my comments: 1. I am not sure we need to add the method buildPrincipal(Map principalConfigs). It seems that user can simply do principalBuilder.configure(...).buildPrincipal(...) without using that metho

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-01-24 Thread Dong Lin
"leader" replica. On Tue, Jan 24, 2017 at 3:30 AM, Alexey Ozeritsky wrote: > > > 23.01.2017, 22:11, "Dong Lin" : > > Thanks. Please see my comment inline. > > > > On Mon, Jan 23, 2017 at 6:45 AM, Alexey Ozeritsky > > wrote: > > > >>

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-23 Thread Dong Lin
rg/confluence/pages/diffpagesbyversion.action?pageId=67636826&selectedPageVersions=13&selectedPageVersions=14>. Please let me know if you have any concern with this change. Thanks, Dong On Mon, Jan 23, 2017 at 11:20 AM, Dong Lin wrote: > Thanks for the comment Jun. > > Yeah

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-23 Thread Dong Lin
seems that it's simpler to just have a > blocking api and returns Map? > > Thanks, > > Jun > > On Sun, Jan 22, 2017 at 3:56 PM, Dong Lin wrote: > > > Thanks for the comment Guozhang. Please don't worry about being late. I > > would like to update the KIP if

Re: [DISCUSS] KIP-113: Support replicas movement between log directories

2017-01-23 Thread Dong Lin
Thanks. Please see my comment inline. On Mon, Jan 23, 2017 at 6:45 AM, Alexey Ozeritsky wrote: > > > 13.01.2017, 22:29, "Dong Lin" : > > Hey Alexey, > > > > Thanks for your review and the alternative approach. Here is my > > understanding of your pa

Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

2017-01-22 Thread Dong Lin
> 4. When broker had one of the dir failed, it can modify its " > /brokers/ids/[brokerId]" registry and remove the dir id, controller already > listening on this path can then be notified and run the replica assignment > accordingly where replica id is computed as above. >

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-22 Thread Dong Lin
te is called. Just wanted to point out > for the record that this approach may have some operational scenarios where > one of the replication files is missing and we need to treat them > specifically. > > > Guozhang > > > On Sun, Jan 22, 2017 at 1:56 PM, Dong Lin wrote:

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-22 Thread Dong Lin
n that > only have one of the watermarks in case of a failure in between writing two > files. > > Guozhang > > On Sun, Jan 22, 2017 at 12:03 AM, Dong Lin wrote: > > > Hey Guozhang, > > > > Thanks for the review:) Yes it is possible to combine them. Both solu

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-22 Thread Dong Lin
point" > to let it have two values for each partition, i.e.: > > 1 // version number > [number of partitions] > [topic name] [partition id] [lwm] [hwm] > > > This will affect the upgrade path a bit, but I think not by large, and all > other logic will not be affe

Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-18 Thread Dong Lin
On Wed, Jan 18, 2017 at 1:44 PM, Dong Lin wrote: > > > Hi Jun, > > > > After some more thinking, I agree with you that it is better to simply > > throw OffsetOutOfRangeException and not update low_watermark if > > offsetToPurge is larger than high_watermark. > >
