Re: Consumer offsets partitions size much bigger than others

2017-07-24 Thread Luciano Afranllie
Thanks James

The issue was that the log cleaner was disabled in that cluster due to an
old configuration that we have had since 0.9.0.0.

Regards
Luciano

On Tue, Jul 18, 2017 at 7:04 PM, James Cheng <wushuja...@gmail.com> wrote:

> It's possible that the log-cleaning thread has crashed. That is the thread
> that implements log compaction.
>
> Look in the log-cleaner.log file in your kafka debuglog directory to see
> if there is any indication that it has crashed (error messages, stack
> traces, etc).
>
> What version of kafka are you using? 0.10 and prior had some bugs in the
> log-cleaner thread that might sometimes cause it to crash. Those were fixed
> in later versions, but it's always possible there might still be more bugs
> there.
>
> I notice that your __consumer_offsets topic only has replication-factor=1.
> How many brokers are in your cluster? You should increase the replication
> factor to 3.
>
> Older versions of Kafka would try to auto-create the __consumer_offsets
> topic with replication-factor 3, but if there were fewer than 3 brokers in
> the cluster, they would simply use the number of brokers in the cluster.
> That means that if your cluster had only 1 broker running at the time the
> topic was auto-created, it would be created with replication-factor 1. This
> has been fixed in later brokers, so that they will always create topics
> with the specified number of replicas or throw loud errors if you don't
> have enough brokers.
>
> -James
>
> > On Jul 18, 2017, at 8:44 AM, Luciano Afranllie <listas.luaf...@gmail.com>
> wrote:
> >
> > Hi
> >
> > One of our Kafka brokers was running out of disk space, and when we
> > checked the file sizes in the Kafka log dir we observed the following
> >
> > $ du -h . --max-depth=2 | grep '__consumer_offsets'
> > 4.0K  ./kafka-logs/__consumer_offsets-16
> > 4.0K  ./kafka-logs/__consumer_offsets-40
> > 35G   ./kafka-logs/__consumer_offsets-44
> > 4.0K  ./kafka-logs/__consumer_offsets-8
> > 4.0K  ./kafka-logs/__consumer_offsets-38
> > 4.0K  ./kafka-logs/__consumer_offsets-20
> > 4.0K  ./kafka-logs/__consumer_offsets-34
> > 4.0K  ./kafka-logs/__consumer_offsets-18
> > 4.0K  ./kafka-logs/__consumer_offsets-32
> > 251G  ./kafka-logs/__consumer_offsets-14
> > 4.0K  ./kafka-logs/__consumer_offsets-4
> > 4.0K  ./kafka-logs/__consumer_offsets-26
> > 4.0K  ./kafka-logs/__consumer_offsets-12
> > 4.0K  ./kafka-logs/__consumer_offsets-30
> > 4.0K  ./kafka-logs/__consumer_offsets-6
> > 4.0K  ./kafka-logs/__consumer_offsets-2
> > 4.0K  ./kafka-logs/__consumer_offsets-24
> > 4.0K  ./kafka-logs/__consumer_offsets-36
> > 4.0K  ./kafka-logs/__consumer_offsets-46
> > 4.0K  ./kafka-logs/__consumer_offsets-42
> > 4.0K  ./kafka-logs/__consumer_offsets-22
> > 4.0K  ./kafka-logs/__consumer_offsets-0
> > 4.0K  ./kafka-logs/__consumer_offsets-28
> > 4.0K  ./kafka-logs/__consumer_offsets-10
> > 4.0K  ./kafka-logs/__consumer_offsets-48
> >
> > As you can see, two of the log files (partitions 44 and 14) have a huge
> > size. Do you have a hint to understand what could be happening here?
> > Maybe for some reason these partitions are not being compacted?
> >
> > By the way, this is the description of the __consumer_offsets topic.
> >
> > # ./bin/kafka-topics.sh --describe --zookeeper x.x.x.x:2181 --topic
> > __consumer_offsets
> > Topic: __consumer_offsets   PartitionCount: 50   ReplicationFactor: 1
> > Configs: segment.bytes=104857600,cleanup.policy=compact,compression.type=uncompressed
> > Topic: __consumer_offsets   Partition: 0   Leader: 1   Replicas: 1   Isr: 1
> > Topic: __consumer_offsets   Partition: 1   Leader: 2   Replicas: 2   Isr: 2
> > Topic: __consumer_offsets   Partition: 2   Leader: 1   Replicas: 1   Isr: 1
> > Topic: __consumer_offsets   Partition: 3   Leader: 2   Replicas: 2   Isr: 2
> > Topic: __consumer_offsets   Partition: 4   Leader: 1   Replicas: 1   Isr: 1
> > Topic: __consumer_offsets   Partition: 5   Leader: 2   Replicas: 2   Isr: 2
> > Topic: __consumer_offsets   Partition: 6   Leader: 1   Replicas: 1   Isr: 1
> > Topic: __consumer_offsets   Partition: 7   Leader: 2   Replicas: 2   Isr: 2
> > Topic: __consumer_offsets   Partition: 8   Leader: 1   Replicas: 1   Isr: 1
> > Topic: __consumer_offsets   Partition: 9   Leader: 2   R
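James's replication-factor suggestion is usually applied with kafka-reassign-partitions.sh, which takes a JSON plan. A minimal sketch that generates such a plan, assuming a cluster whose broker IDs are 1, 2 and 3 (hypothetical IDs; adjust to your cluster before feeding the output to the tool with --execute):

```python
import json

# Build a reassignment plan that raises __consumer_offsets to
# replication-factor 3. Broker IDs 1, 2, 3 are assumptions.
brokers = [1, 2, 3]
num_partitions = 50

plan = {
    "version": 1,
    "partitions": [
        {
            "topic": "__consumer_offsets",
            "partition": p,
            # Rotate the broker list so preferred leadership stays spread out.
            "replicas": [brokers[(p + i) % len(brokers)] for i in range(3)],
        }
        for p in range(num_partitions)
    ],
}
print(json.dumps(plan, indent=2))
```

The printed JSON is in the format kafka-reassign-partitions.sh expects for its --reassignment-json-file argument.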

Consumer offsets partitions size much bigger than others

2017-07-18 Thread Luciano Afranllie
Hi

One of our Kafka brokers was running out of disk space, and when we checked
the file sizes in the Kafka log dir we observed the following

$ du -h . --max-depth=2 | grep '__consumer_offsets'
4.0K  ./kafka-logs/__consumer_offsets-16
4.0K  ./kafka-logs/__consumer_offsets-40
35G   ./kafka-logs/__consumer_offsets-44
4.0K  ./kafka-logs/__consumer_offsets-8
4.0K  ./kafka-logs/__consumer_offsets-38
4.0K  ./kafka-logs/__consumer_offsets-20
4.0K  ./kafka-logs/__consumer_offsets-34
4.0K  ./kafka-logs/__consumer_offsets-18
4.0K  ./kafka-logs/__consumer_offsets-32
251G  ./kafka-logs/__consumer_offsets-14
4.0K  ./kafka-logs/__consumer_offsets-4
4.0K  ./kafka-logs/__consumer_offsets-26
4.0K  ./kafka-logs/__consumer_offsets-12
4.0K  ./kafka-logs/__consumer_offsets-30
4.0K  ./kafka-logs/__consumer_offsets-6
4.0K  ./kafka-logs/__consumer_offsets-2
4.0K  ./kafka-logs/__consumer_offsets-24
4.0K  ./kafka-logs/__consumer_offsets-36
4.0K  ./kafka-logs/__consumer_offsets-46
4.0K  ./kafka-logs/__consumer_offsets-42
4.0K  ./kafka-logs/__consumer_offsets-22
4.0K  ./kafka-logs/__consumer_offsets-0
4.0K  ./kafka-logs/__consumer_offsets-28
4.0K  ./kafka-logs/__consumer_offsets-10
4.0K  ./kafka-logs/__consumer_offsets-48

As you can see, two of the log files (partitions 44 and 14) have a huge
size. Do you have a hint to understand what could be happening here? Maybe
for some reason these partitions are not being compacted?
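One reason only a couple of partitions hold data at all: each consumer group's offset commits land in a single partition of __consumer_offsets, chosen from the hash of the group id. A sketch of that mapping (the group names below are made up; the hash mirrors Java's String.hashCode, which Kafka uses for group ids):

```python
def java_string_hashcode(s: str) -> int:
    """Reproduce Java's String.hashCode(): h = 31*h + c on 32-bit ints."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Interpret the result as a signed 32-bit integer, as Java does.
    return h - 0x100000000 if h >= 0x80000000 else h

def offsets_partition(group_id: str, num_partitions: int = 50) -> int:
    """Which __consumer_offsets partition a group's commits go to."""
    # Kafka masks off the sign bit rather than calling Math.abs().
    return (java_string_hashcode(group_id) & 0x7FFFFFFF) % num_partitions

# Example: an active group whose commits all hit one partition.
print(offsets_partition("my-consumer-group"))
```

So a few very active groups whose partitions are not being compacted can make exactly those partitions balloon while the rest stay at 4.0K.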

By the way, this is the description of the __consumer_offsets topic.

# ./bin/kafka-topics.sh --describe --zookeeper x.x.x.x:2181 --topic
__consumer_offsets
Topic: __consumer_offsets   PartitionCount: 50   ReplicationFactor: 1
Configs: segment.bytes=104857600,cleanup.policy=compact,compression.type=uncompressed
Topic: __consumer_offsets   Partition: 0    Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 1    Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 2    Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 3    Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 4    Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 5    Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 6    Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 7    Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 8    Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 9    Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 10   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 11   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 12   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 13   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 14   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 15   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 16   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 17   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 18   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 19   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 20   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 21   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 22   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 23   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 24   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 25   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 26   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 27   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 28   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 29   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 30   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 31   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 32   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 33   Leader: 2   Replicas: 2   Isr: 2
Topic: __consumer_offsets   Partition: 34   Leader: 1   Replicas: 1   Isr: 1
Topic: __consumer_offsets   Partition: 35   Leader: 2   Replicas: 2   Isr: 2

Re: Trying to understand design decision about producer ack and min.insync.replicas

2017-01-26 Thread Luciano Afranllie
I was thinking about the situation where you have fewer brokers in the ISR
list than the number set in min.insync.replicas.

My idea was that if I, as an administrator, want to favor durability over
availability for a given topic, then if that topic has fewer in-sync
replicas than the value set in min.insync.replicas, I may want to stop
producing to the topic. The way min.insync.replicas and acks work, I need to
coordinate with all producers in order to achieve this. There is no way (or
I don't know of one) to globally enforce that producing to a topic stops
when it is under-replicated.

I don't see why, for the same topic, some producers might want to get an
error when the number of ISRs is below min.insync.replicas while other
producers don't. I think it could be more useful to be able to specify that
ALL producers should get an error when a given topic is under-replicated,
so they stop producing, than for a single producer to get an error when ANY
topic is under-replicated. I don't have a lot of experience with Kafka, so
I may be missing some use cases.

But I understand your point: the min.insync.replicas setting should be
understood as "if a producer wants to get an error when topics are under
replicated, then how many replicas are enough for not raising an error?"


On Thu, Jan 26, 2017 at 4:16 PM, Ewen Cheslack-Postava <e...@confluent.io>
wrote:

> The acks setting for the producer doesn't affect the final durability
> guarantees. These are still enforced by the replication and min ISR
> settings. Instead, the ack setting just lets the producer control how
> durable the write is before *that producer* can consider the write
> "complete", i.e. before it gets an ack.
>
> -Ewen
>
> On Tue, Jan 24, 2017 at 12:46 PM, Luciano Afranllie <
> listas.luaf...@gmail.com> wrote:
>
> > Hi everybody
> >
> > I am trying to understand why Kafka lets each individual producer, on a
> > per-connection basis, choose the tradeoff between availability and
> > durability, honoring the min.insync.replicas value only if the producer
> > uses acks=all.
> >
> > I mean, for a single topic, cluster administrators can't enforce that
> > messages be stored in a minimum number of replicas without coordinating
> > with all producers to that topic so that all of them use acks=all.
> >
> > Is there something that I am missing? Is there any other strategy to
> > overcome this situation?
> >
> > Regards
> > Luciano
> >
>
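Ewen's point can be captured in a toy model (a sketch of the semantics, not Kafka code): the broker enforces min.insync.replicas only when the producer asked for acks=all, so two producers writing to the same under-replicated topic can see different outcomes.

```python
def write_accepted(acks: str, isr_count: int, min_insync_replicas: int) -> bool:
    """Toy model of broker-side behavior, not Kafka code.

    min.insync.replicas is only enforced for acks=all; with acks=0 or
    acks=1 the broker accepts the write even when the topic is
    under-replicated.
    """
    if acks == "all":
        # Broker rejects the write (NotEnoughReplicas) if the ISR is too small.
        return isr_count >= min_insync_replicas
    # acks=0 or acks=1: the durability check is skipped entirely.
    return True

# Two producers, same under-replicated topic (ISR=1, min.insync.replicas=2):
print(write_accepted("all", isr_count=1, min_insync_replicas=2))  # False
print(write_accepted("1", isr_count=1, min_insync_replicas=2))    # True
```

This is why the durability floor cannot be enforced cluster-wide without coordinating the acks setting across all producers.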


Trying to understand design decision about producer ack and min.insync.replicas

2017-01-24 Thread Luciano Afranllie
Hi everybody

I am trying to understand why Kafka lets each individual producer, on a
per-connection basis, choose the tradeoff between availability and
durability, honoring the min.insync.replicas value only if the producer uses
acks=all.

I mean, for a single topic, cluster administrators can't enforce that
messages be stored in a minimum number of replicas without coordinating with
all producers to that topic so that all of them use acks=all.

Is there something that I am missing? Is there any other strategy to
overcome this situation?

Regards
Luciano


Re: [GitHub] kafka pull request #1661: KAFKA-3987: Allow config of the hash algorithm use...

2016-08-04 Thread Luciano Afranllie
Hi

Could you please tell me if this pull request can be merged and, if so,
when we can expect this to happen?

Thanks
Luciano

On Tue, Jul 26, 2016 at 9:01 AM, Luciano Afranllie <listas.luaf...@gmail.com
> wrote:

> Hi
>
> Could somebody with commit permission review (and eventually merge) this
> pull request?
>
> Regards
> Luciano
>
> On Mon, Jul 25, 2016 at 11:49 AM, luafran <g...@git.apache.org> wrote:
>
>> GitHub user luafran opened a pull request:
>>
>> https://github.com/apache/kafka/pull/1661
>>
>> KAFKA-3987: Allow config of the hash algorithm used by the log cleaner
>>
>> Allow configuration of the hash algorithm used by the Log Cleaner's
>> offset map
>>
>> You can merge this pull request into a Git repository by running:
>>
>> $ git pull https://github.com/luafran/kafka
>> config-for-log-cleaner-hash-algo
>>
>> Alternatively you can review and apply these changes as the patch at:
>>
>> https://github.com/apache/kafka/pull/1661.patch
>>
>> To close this pull request, make a commit to your master/trunk branch
>> with (at least) the following in the commit message:
>>
>> This closes #1661
>>
>> 
>> commit 2e7e507903c73740ca498405c5680a8c528ccda6
>> Author: Luciano Afranllie <luaf...@gmail.com>
>> Date:   2016-07-25T14:39:59Z
>>
>> KAFKA-3987: Allow configuration of the hash algorithm used by the
>> LogCleaner's offset map
>>
>> 
>>
>>
>> ---
>> If your project is set up for it, you can reply to this email and have
>> your
>> reply appear on GitHub as well. If your project does not have this feature
>> enabled and wishes so, or if the feature is enabled but not working,
>> please
>> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
>> with INFRA.
>> ---
>>
>
>


[jira] [Updated] (KAFKA-3987) Allow configuration of the hash algorithm used by the LogCleaner's offset map

2016-07-27 Thread Luciano Afranllie (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luciano Afranllie updated KAFKA-3987:
-
Reviewer: Shikhar Bhushan
  Status: Patch Available  (was: Open)

> Allow configuration of the hash algorithm used by the LogCleaner's offset map
> -
>
> Key: KAFKA-3987
> URL: https://issues.apache.org/jira/browse/KAFKA-3987
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>    Reporter: Luciano Afranllie
>Priority: Minor
> Fix For: 0.10.1.0
>
>
> In order to be able to do deployments of Kafka that are FIPS 140-2 
> (https://en.wikipedia.org/wiki/FIPS_140-2) compliant, one of the requirements 
> is not to use MD5.
> Kafka is using MD5 to hash message keys in the offset map (SkimpyOffsetMap) 
> used by the log cleaner.
> The idea is to be able to configure this hash algorithm to something allowed 
> by FIPS using a new configuration property.
> The property could be named "log.cleaner.hash.algorithm" with a default value 
> equal to "MD5" and the idea is to use it in the constructor of CleanerConfig.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [GitHub] kafka pull request #1661: KAFKA-3987: Allow config of the hash algorithm use...

2016-07-26 Thread Luciano Afranllie
Hi

Could somebody with commit permission review (and eventually merge) this
pull request?

Regards
Luciano

On Mon, Jul 25, 2016 at 11:49 AM, luafran <g...@git.apache.org> wrote:

> GitHub user luafran opened a pull request:
>
> https://github.com/apache/kafka/pull/1661
>
> KAFKA-3987: Allow config of the hash algorithm used by the log cleaner
>
> Allow configuration of the hash algorithm used by the Log Cleaner's
> offset map
>
> You can merge this pull request into a Git repository by running:
>
> $ git pull https://github.com/luafran/kafka
> config-for-log-cleaner-hash-algo
>
> Alternatively you can review and apply these changes as the patch at:
>
> https://github.com/apache/kafka/pull/1661.patch
>
> To close this pull request, make a commit to your master/trunk branch
> with (at least) the following in the commit message:
>
> This closes #1661
>
> 
> commit 2e7e507903c73740ca498405c5680a8c528ccda6
> Author: Luciano Afranllie <luaf...@gmail.com>
> Date:   2016-07-25T14:39:59Z
>
> KAFKA-3987: Allow configuration of the hash algorithm used by the
> LogCleaner's offset map
>
> 
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---
>


Re: Changing hash algorithm to LogCleaner offset map

2016-07-24 Thread Luciano Afranllie
Thanks Shikhar.

I have created KAFKA-3987 (https://issues.apache.org/jira/browse/KAFKA-3987).
Could anybody please assign that issue to me? I could not do it. I have a
patch ready and will open a pull request tomorrow.

Regards


On Sun, Jul 24, 2016 at 4:45 PM, Shikhar Bhushan <shik...@confluent.io>
wrote:

> Got it, makes sense to make the hash function customizable if there are
> environments in which md5 usage is prevented. The approach you are
> proposing sounds good to me.
> On Sat, Jul 23, 2016 at 14:56 Luciano Afranllie <listas.luaf...@gmail.com>
> wrote:
>
> > There is nothing wrong with using MD5 for that from a FIPS point of
> > view, but we want to deploy with FIPS 140-2 mode enabled using only RSA
> > security providers. With these settings it is not possible to use MD5.
> >
> > On Fri, Jul 22, 2016 at 8:49 PM, Shikhar Bhushan <shik...@confluent.io>
> > wrote:
> >
> > > Not sure I understand the motivation to use a FIPS-compliant hash
> > function
> > > for log compaction -- what are the security ramifications?
> > >
> > > On Fri, Jul 22, 2016 at 2:56 PM Luciano Afranllie <
> > > listas.luaf...@gmail.com>
> > > wrote:
> > >
> > > > A little bit of background first.
> > > >
> > > > We are trying to make a deployment of Kafka that is FIPS 140-2 (
> > > > https://en.wikipedia.org/wiki/FIPS_140-2) compliant, and one of the
> > > > requirements is not to use MD5.
> > > >
> > > > As far as we could see, Kafka is using MD5 only to hash message keys
> > > > in an offset map (SkimpyOffsetMap) used by the log cleaner. So, we are
> > > > planning to change the hash algorithm to something allowed by FIPS.
> > > >
> > > > With this in mind we are thinking that it would be great if we can
> > > > add a config property LogCleanerHashAlgorithmProp =
> > > > "log.cleaner.hash.algorithm"
> > > > with a default value equal to "MD5" and use it in the constructor
> > > > of CleanerConfig. In that case in future versions of Kafka we can just
> > > > change the value of this property.
> > > >
> > > > Please let me know if you are Ok with this change.
> > > > Is it enough to create a pull request for this? Should I create a
> > > > Jira first?
> > > >
> > > > Regards
> > > > Luciano
> > > >
> > > > On Fri, Jul 22, 2016 at 5:58 PM, Luciano Afranllie <
> > > > listas.luaf...@gmail.com
> > > > > wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > We are evaluating changing the hash algorithm used by the
> > > > > SkimpyOffsetMap used by the LogCleaner from MD5 to SHA-1.
> > > > >
> > > > > Besides the impact on performance (more memory, more CPU usage), is
> > > > > there anything else that may be affected?
> > > > >
> > > > > Regards
> > > > > Luciano
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (KAFKA-3987) Allow configuration of the hash algorithm used by the LogCleaner's offset map

2016-07-24 Thread Luciano Afranllie (JIRA)
Luciano Afranllie created KAFKA-3987:


 Summary: Allow configuration of the hash algorithm used by the 
LogCleaner's offset map
 Key: KAFKA-3987
 URL: https://issues.apache.org/jira/browse/KAFKA-3987
 Project: Kafka
  Issue Type: Improvement
  Components: config
Reporter: Luciano Afranllie
Priority: Minor
 Fix For: 0.10.1.0


In order to be able to do deployments of Kafka that are FIPS 140-2 
(https://en.wikipedia.org/wiki/FIPS_140-2) compliant, one of the requirements is 
not to use MD5.

Kafka is using MD5 to hash message keys in the offset map (SkimpyOffsetMap) 
used by the log cleaner.

The idea is to be able to configure this hash algorithm to something allowed by 
FIPS using a new configuration property.

The property could be named "log.cleaner.hash.algorithm" with a default value 
equal to "MD5" and the idea is to use it in the constructor of CleanerConfig.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Changing hash algorithm to LogCleaner offset map

2016-07-23 Thread Luciano Afranllie
There is nothing wrong with using MD5 for that from a FIPS point of view,
but we want to deploy with FIPS 140-2 mode enabled using only RSA security
providers. With these settings it is not possible to use MD5.

On Fri, Jul 22, 2016 at 8:49 PM, Shikhar Bhushan <shik...@confluent.io>
wrote:

> Not sure I understand the motivation to use a FIPS-compliant hash function
> for log compaction -- what are the security ramifications?
>
> On Fri, Jul 22, 2016 at 2:56 PM Luciano Afranllie <
> listas.luaf...@gmail.com>
> wrote:
>
> > A little bit of background first.
> >
> > We are trying to make a deployment of Kafka that is FIPS 140-2 (
> > https://en.wikipedia.org/wiki/FIPS_140-2) compliant, and one of the
> > requirements is not to use MD5.
> >
> > As far as we could see, Kafka is using MD5 only to hash message keys in an
> > offset map (SkimpyOffsetMap) used by the log cleaner. So, we are planning
> > to change the hash algorithm to something allowed by FIPS.
> >
> > With this in mind we are thinking that it would be great if we can add a
> > config property LogCleanerHashAlgorithmProp =
> > "log.cleaner.hash.algorithm"
> > with a default value equal to "MD5" and use it in the constructor
> > of CleanerConfig. In that case in future versions of Kafka we can just
> > change the value of this property.
> >
> > Please let me know if you are OK with this change.
> > Is it enough to create a pull request for this? Should I create a Jira
> > first?
> >
> > Regards
> > Luciano
> >
> > On Fri, Jul 22, 2016 at 5:58 PM, Luciano Afranllie <
> > listas.luaf...@gmail.com
> > > wrote:
> >
> > > Hi
> > >
> > > We are evaluating changing the hash algorithm used by the
> > > SkimpyOffsetMap used by the LogCleaner from MD5 to SHA-1.
> > >
> > > Besides the impact on performance (more memory, more CPU usage), is
> > > there anything else that may be affected?
> > >
> > > Regards
> > > Luciano
> > >
> >
>


Re: Changing hash algorithm to LogCleaner offset map

2016-07-22 Thread Luciano Afranllie
A little bit of background first.

We are trying to make a deployment of Kafka that is FIPS 140-2 (
https://en.wikipedia.org/wiki/FIPS_140-2) compliant, and one of the
requirements is not to use MD5.

As far as we could see, Kafka is using MD5 only to hash message keys in an
offset map (SkimpyOffsetMap) used by the log cleaner. So, we are planning
to change the hash algorithm to something allowed by FIPS.

With this in mind, we are thinking it would be great if we could add a
config property LogCleanerHashAlgorithmProp = "log.cleaner.hash.algorithm"
with a default value of "MD5" and use it in the constructor of
CleanerConfig. In that case, in future versions of Kafka, we could just
change the value of this property.

Please let me know if you are OK with this change.
Is it enough to create a pull request for this? Should I create a Jira
first?

Regards
Luciano

On Fri, Jul 22, 2016 at 5:58 PM, Luciano Afranllie <listas.luaf...@gmail.com
> wrote:

> Hi
>
> We are evaluating changing the hash algorithm used by the SkimpyOffsetMap
> used by the LogCleaner from MD5 to SHA-1.
>
> Besides the impact on performance (more memory, more CPU usage), is there
> anything else that may be affected?
>
> Regards
> Luciano
>
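The proposal boils down to making the digest algorithm a config lookup instead of a hard-coded "MD5". A sketch of the idea in Python, with hashlib standing in for Java's MessageDigest (the property name comes from the proposal; everything else here is illustrative):

```python
import hashlib

# Illustrative stand-in for CleanerConfig: the hash algorithm becomes a
# configurable property instead of a hard-coded "MD5".
DEFAULTS = {"log.cleaner.hash.algorithm": "MD5"}

def key_hash(key: bytes, config: dict = DEFAULTS) -> bytes:
    """Hash a message key the way SkimpyOffsetMap does, conceptually."""
    algorithm = config.get("log.cleaner.hash.algorithm", "MD5")
    # Map Java MessageDigest names ("MD5", "SHA-1") to hashlib names.
    digest = hashlib.new(algorithm.replace("-", "").lower())
    digest.update(key)
    return digest.digest()

# The default stays MD5; a FIPS deployment switches via configuration only.
print(len(key_hash(b"some-key")))                                           # 16
print(len(key_hash(b"some-key", {"log.cleaner.hash.algorithm": "SHA-1"})))  # 20
```

Keeping "MD5" as the default preserves existing behavior, while a restricted deployment can flip the property without a code change.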


Changing hash algorithm to LogCleaner offset map

2016-07-22 Thread Luciano Afranllie
Hi

We are evaluating changing the hash algorithm used by the SkimpyOffsetMap
used by the LogCleaner from MD5 to SHA-1.

Besides the impact on performance (more memory, more CPU usage), is there
anything else that may be affected?

Regards
Luciano
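On the memory side of the question: SkimpyOffsetMap stores one digest per key, so the digest length drives memory use. A quick check of the sizes involved (Python's hashlib as a stand-in for the Java digests):

```python
import hashlib

# Each stored hash grows from 16 bytes (MD5) to 20 bytes (SHA-1), so
# fewer keys fit in the same log-cleaner dedupe buffer.
md5_len = hashlib.md5().digest_size
sha1_len = hashlib.sha1().digest_size
print(md5_len, sha1_len)  # 16 20
print(f"per-hash growth: {sha1_len / md5_len - 1:.0%}")  # per-hash growth: 25%
```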


Re: Use Kafka 0.9.0.1 or 0.10.0.0 with Zookeeper 3.5

2016-07-11 Thread Luciano Afranllie
Hi

I was able to run Kafka 0.9 using zookeeper-3.5.2-alpha with SSL enabled. I
did not do deep testing, but I was able to run Kafka and produce/consume
using the console tools.

What I did was build zkClient against zookeeper-3.5.2-alpha and then build
Kafka with this new zkClient, zookeeper-3.5.2-alpha and Netty.
After that, I modified the ZooKeeper and Kafka start scripts and config to
use SSL as described in the ZooKeeper SSL User Guide
<https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+SSL+User+Guide>

If you think it is useful, I can add a detailed description to the Kafka
wiki or some other wiki.

Regards
Luciano

On Thu, Jun 30, 2016 at 8:42 AM, Luciano Afranllie <listas.luaf...@gmail.com
> wrote:

> Thanks Flavio
>
> Your last statement is the key here. What we really need is to secure
> connections between Kafka and zk and between zk nodes. Our understanding so
> far is that zk 3.5 will allow us to do that, so we want to understand what
> we need to do on the Kafka side in order to use TLS connections to ZooKeeper.
>
> Regards
> Luciano
>
>
> On Wed, Jun 29, 2016 at 7:10 PM, Flavio Junqueira <f...@apache.org> wrote:
>
>> Hi Luciano,
>>
>> I can't remember seeing a discussion in this community about 3.5. I
>> suspect your real question is how to set it up to use TLS/SSL because
>> everything else should remain the same. You should be able to run a Kafka
>> cluster as is against a 3.5 ensemble, but you won't have a secure
>> connection to zk.
>>
>> -Flavio
>>
>> > On 29 Jun 2016, at 22:16, Luciano Afranllie <listas.luaf...@gmail.com>
>> wrote:
>> >
>> > Hi
>> >
>> > I would like some advice about what changes, at a high level, are
>> > required to use Kafka 0.9.0.1 or 0.10.0.0 with ZooKeeper 3.5.x
>> > (3.5.1-alpha for example) using TLS/SSL. How big are the changes
>> > required in Kafka in order to be able to do this?
>> >
>> > Regards
>> > Luciano
>>
>>
>


Re: Use Kafka 0.9.0.1 or 0.10.0.0 with Zookeeper 3.5

2016-06-30 Thread Luciano Afranllie
Thanks Flavio

Your last statement is the key here. What we really need is to secure
connections between Kafka and zk and between zk nodes. Our understanding so
far is that zk 3.5 will allow us to do that, so we want to understand what
we need to do on the Kafka side in order to use TLS connections to ZooKeeper.

Regards
Luciano


On Wed, Jun 29, 2016 at 7:10 PM, Flavio Junqueira <f...@apache.org> wrote:

> Hi Luciano,
>
> I can't remember seeing a discussion in this community about 3.5. I
> suspect your real question is how to set it up to use TLS/SSL because
> everything else should remain the same. You should be able to run a Kafka
> cluster as is against a 3.5 ensemble, but you won't have a secure
> connection to zk.
>
> -Flavio
>
> > On 29 Jun 2016, at 22:16, Luciano Afranllie <listas.luaf...@gmail.com>
> wrote:
> >
> > Hi
> >
> > I would like some advice about what changes, at a high level, are
> > required to use Kafka 0.9.0.1 or 0.10.0.0 with ZooKeeper 3.5.x
> > (3.5.1-alpha for example) using TLS/SSL. How big are the changes
> > required in Kafka in order to be able to do this?
> >
> > Regards
> > Luciano
>
>


Use Kafka 0.9.0.1 or 0.10.0.0 with Zookeeper 3.5

2016-06-29 Thread Luciano Afranllie
Hi

I would like some advice about what changes, at a high level, are required
to use Kafka 0.9.0.1 or 0.10.0.0 with ZooKeeper 3.5.x (3.5.1-alpha for
example) using TLS/SSL. How big are the changes required in Kafka in order
to be able to do this?

Regards
Luciano


What about Audit feature?

2016-01-11 Thread Luciano Afranllie
Hi

The Kafka documentation mentions an Audit feature in section 6.6, but
https://issues.apache.org/jira/browse/KAFKA-260 is resolved as Won't Fix.

Should this section of the documentation be removed?

Regards
Luciano


Status of multi-tenancy

2016-01-08 Thread Luciano Afranllie
Hi there

We are interested in adding support for multiple tenants to Kafka, and I
came across this thread:

http://grokbase.com/t/kafka/dev/154wsscrsk/adding-multi-tenancy-capabilities-to-kafka

Could you please let me know the status of this proposal?

Is this something we can move forward with?

Regards
Luciano