[jira] [Commented] (KAFKA-8338) Improve consumer offset expiration logic to take subscription into account

2019-05-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839998#comment-16839998
 ] 

ASF GitHub Bot commented on KAFKA-8338:
---

huxihx commented on pull request #6737: KAFKA-8338: consumer offset expiration 
should consider subscription.
URL: https://github.com/apache/kafka/pull/6737
 
 
   https://issues.apache.org/jira/browse/KAFKA-8338
   
   Currently, only empty groups are checked for expired offsets. However, if a 
group is in the Stable state but no longer subscribes to some partitions, the 
offsets for those partitions will never be removed.
   
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve consumer offset expiration logic to take subscription into account
> --
>
> Key: KAFKA-8338
> URL: https://issues.apache.org/jira/browse/KAFKA-8338
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Gwen Shapira
>Assignee: huxihx
>Priority: Major
>
> Currently, we expire consumer offsets for a group after the group is 
> considered gone.
> There is a case where the consumer group still exists but is now subscribed 
> to different topics. In that case, the offsets of the old topics will never 
> expire, and if lag is monitored, the monitors will show ever-growing lag on 
> those topics. 
> We need to improve the logic to expire the consumer offsets if the consumer 
> group hasn't subscribed to specific topics/partitions for long enough.
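The rule proposed above can be sketched as follows. This is an illustrative sketch, not the broker's actual implementation; all names and parameters are hypothetical:

```java
import java.util.Set;

public class OffsetExpirySketch {
    // Hypothetical sketch of the proposed rule: an offset may expire when the
    // group is empty, or when a still-live group no longer subscribes to the
    // offset's topic; in both cases the offset must be older than the retention.
    static boolean shouldExpire(boolean groupIsEmpty, Set<String> subscribedTopics,
                                String topic, long commitTimestampMs,
                                long nowMs, long retentionMs) {
        boolean oldEnough = nowMs - commitTimestampMs >= retentionMs;
        if (groupIsEmpty) {
            return oldEnough; // current behavior: only empty groups are checked
        }
        // proposed addition: also expire offsets of topics the group dropped
        return oldEnough && !subscribedTopics.contains(topic);
    }
}
```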



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-8338) Improve consumer offset expiration logic to take subscription into account

2019-05-14 Thread huxihx (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxihx reassigned KAFKA-8338:
-

Assignee: huxihx

> Improve consumer offset expiration logic to take subscription into account
> --
>
> Key: KAFKA-8338
> URL: https://issues.apache.org/jira/browse/KAFKA-8338
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Gwen Shapira
>Assignee: huxihx
>Priority: Major
>
> Currently, we expire consumer offsets for a group after the group is 
> considered gone.
> There is a case where the consumer group still exists but is now subscribed 
> to different topics. In that case, the offsets of the old topics will never 
> expire, and if lag is monitored, the monitors will show ever-growing lag on 
> those topics. 
> We need to improve the logic to expire the consumer offsets if the consumer 
> group hasn't subscribed to specific topics/partitions for long enough.





[jira] [Commented] (KAFKA-8361) Fix ConsumerPerformanceTest#testNonDetailedHeaderMatchBody to test a real ConsumerPerformance's method

2019-05-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839982#comment-16839982
 ] 

ASF GitHub Bot commented on KAFKA-8361:
---

sekikn commented on pull request #6736: KAFKA-8361. Fix 
ConsumerPerformanceTest#testNonDetailedHeaderMatchBody to test a real 
ConsumerPerformance's method
URL: https://github.com/apache/kafka/pull/6736
 
 
   kafka.tools.ConsumerPerformanceTest#testNonDetailedHeaderMatchBody
   doesn't work as a regression test, since it checks the number of
   fields output by an inline `println`, not by a real method of
   ConsumerPerformance. This PR makes ConsumerPerformance's output logic
   an independent method and has testNonDetailedHeaderMatchBody test it.
   It also includes some formatting fixes.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 



> Fix ConsumerPerformanceTest#testNonDetailedHeaderMatchBody to test a real 
> ConsumerPerformance's method
> --
>
> Key: KAFKA-8361
> URL: https://issues.apache.org/jira/browse/KAFKA-8361
> Project: Kafka
>  Issue Type: Improvement
>  Components: unit tests
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Minor
>
> {{kafka.tools.ConsumerPerformanceTest#testNonDetailedHeaderMatchBody}} 
> doesn't work as a regression test for now, since it tests an anonymous 
> function defined in the test method itself.
> {code:java}
>   @Test
>   def testNonDetailedHeaderMatchBody(): Unit = {
> testHeaderMatchContent(detailed = false, 2, () => 
> println(s"${dateFormat.format(System.currentTimeMillis)}, " +
>   s"${dateFormat.format(System.currentTimeMillis)}, 1.0, 1.0, 1, 1.0, 1, 
> 1, 1.1, 1.1"))
>   }
> {code}
> It should test a real {{ConsumerPerformance}}'s method, just like 
> {{testDetailedHeaderMatchBody}}.
> {code:java}
>   @Test
>   def testDetailedHeaderMatchBody(): Unit = {
> testHeaderMatchContent(detailed = true, 2,
>   () => ConsumerPerformance.printConsumerProgress(1, 1024 * 1024, 0, 1, 
> 0, 0, 1, dateFormat, 1L))
>   }
> {code}
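A minimal sketch of the refactor described above: move the output into a real method so the test can compare field counts between the header and a body line. All names here are hypothetical stand-ins, not Kafka's actual API:

```java
public class ConsumerPerfSketch {
    // Hypothetical stand-ins for the header and the extracted body printer.
    static String nonDetailedHeader() {
        return "time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec";
    }

    static String nonDetailedProgress(String time, double mb, double mbSec,
                                      long nMsg, double nMsgSec) {
        return String.join(", ", time, String.valueOf(mb), String.valueOf(mbSec),
                String.valueOf(nMsg), String.valueOf(nMsgSec));
    }

    // The regression check: header and body must have the same number of fields.
    static int fieldCount(String line) {
        return line.split(",").length;
    }
}
```

Because the body line now comes from a real method rather than an inline `println` in the test, a change to the production output format would actually fail the test.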





[jira] [Commented] (KAFKA-7847) KIP-421: Automatically resolve external configurations.

2019-05-14 Thread Randall Hauch (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839926#comment-16839926
 ] 

Randall Hauch commented on KAFKA-7847:
--

See https://github.com/apache/kafka/pull/6467 for the PR.

> KIP-421: Automatically resolve external configurations.
> ---
>
> Key: KAFKA-7847
> URL: https://issues.apache.org/jira/browse/KAFKA-7847
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Reporter: TEJAL ADSUL
>Priority: Minor
>  Labels: needs-kip
> Fix For: 2.3.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> This proposal intends to enhance the AbstractConfig base class to support 
> replacing variables in configurations just prior to parsing and validation. 
> This simple change will make it very easy for client applications, Kafka 
> Connect, and Kafka Streams to use shared code to easily incorporate 
> externalized secrets and other variable replacements within their 
> configurations. 
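The variable replacement described above can be sketched roughly as below. This is an illustrative sketch of a "${provider:key}"-style syntax (KIP-297-like); the real implementation resolves values through ConfigProvider implementations and also supports an optional path segment, which this simplified version omits:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableResolverSketch {
    // Replace "${providerName:key}" occurrences in config values using a map of
    // provider name -> (key -> value). Unresolvable variables are left as-is.
    static Map<String, String> resolve(Map<String, String> raw,
                                       Map<String, Map<String, String>> providers) {
        Pattern var = Pattern.compile("\\$\\{([^:}]+):([^}]+)\\}");
        Map<String, String> resolved = new HashMap<>();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            Matcher m = var.matcher(entry.getValue());
            StringBuilder sb = new StringBuilder();
            while (m.find()) {
                String value = providers
                        .getOrDefault(m.group(1), Map.of())
                        .getOrDefault(m.group(2), m.group(0));
                m.appendReplacement(sb, Matcher.quoteReplacement(value));
            }
            m.appendTail(sb);
            resolved.put(entry.getKey(), sb.toString());
        }
        return resolved;
    }
}
```

Running the replacement just before parsing and validation, as the proposal says, means downstream code only ever sees the resolved values.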





[jira] [Resolved] (KAFKA-7847) KIP-421: Automatically resolve external configurations.

2019-05-14 Thread Randall Hauch (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch resolved KAFKA-7847.
--
Resolution: Fixed
  Reviewer: Randall Hauch

> KIP-421: Automatically resolve external configurations.
> ---
>
> Key: KAFKA-7847
> URL: https://issues.apache.org/jira/browse/KAFKA-7847
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Reporter: TEJAL ADSUL
>Priority: Minor
>  Labels: needs-kip
> Fix For: 2.3.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> This proposal intends to enhance the AbstractConfig base class to support 
> replacing variables in configurations just prior to parsing and validation. 
> This simple change will make it very easy for client applications, Kafka 
> Connect, and Kafka Streams to use shared code to easily incorporate 
> externalized secrets and other variable replacements within their 
> configurations. 





[jira] [Commented] (KAFKA-7847) KIP-421: Automatically resolve external configurations.

2019-05-14 Thread Randall Hauch (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839925#comment-16839925
 ] 

Randall Hauch commented on KAFKA-7847:
--

See 
[KIP-421|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100829515]

> KIP-421: Automatically resolve external configurations.
> ---
>
> Key: KAFKA-7847
> URL: https://issues.apache.org/jira/browse/KAFKA-7847
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Reporter: TEJAL ADSUL
>Priority: Minor
>  Labels: needs-kip
> Fix For: 2.3.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> This proposal intends to enhance the AbstractConfig base class to support 
> replacing variables in configurations just prior to parsing and validation. 
> This simple change will make it very easy for client applications, Kafka 
> Connect, and Kafka Streams to use shared code to easily incorporate 
> externalized secrets and other variable replacements within their 
> configurations. 





[jira] [Updated] (KAFKA-7847) KIP-421: Automatically resolve external configurations.

2019-05-14 Thread Randall Hauch (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-7847:
-
Labels: needs-kip  (was: )

> KIP-421: Automatically resolve external configurations.
> ---
>
> Key: KAFKA-7847
> URL: https://issues.apache.org/jira/browse/KAFKA-7847
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Reporter: TEJAL ADSUL
>Priority: Minor
>  Labels: needs-kip
> Fix For: 2.3.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> This proposal intends to enhance the AbstractConfig base class to support 
> replacing variables in configurations just prior to parsing and validation. 
> This simple change will make it very easy for client applications, Kafka 
> Connect, and Kafka Streams to use shared code to easily incorporate 
> externalized secrets and other variable replacements within their 
> configurations. 





[jira] [Created] (KAFKA-8366) partitions of topics being deleted show up in the offline partitions metric

2019-05-14 Thread radai rosenblatt (JIRA)
radai rosenblatt created KAFKA-8366:
---

 Summary: partitions of topics being deleted show up in the offline 
partitions metric
 Key: KAFKA-8366
 URL: https://issues.apache.org/jira/browse/KAFKA-8366
 Project: Kafka
  Issue Type: Improvement
Reporter: radai rosenblatt


I believe this is a bug.
Offline partitions is a metric that indicates an error condition: lack of 
Kafka availability.
As an artifact of how deletion is implemented, the partitions of a topic 
undergoing deletion show up as offline, which just creates false-positive 
alerts.

If needed, maybe there should be a separate "partitions to be deleted" 
sensor.
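The suggested split can be sketched as follows. This is a hypothetical illustration, not the controller's actual metric code: partitions of topics queued for deletion are counted under a separate gauge so the offline-partitions gauge only reflects genuine availability loss:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OfflineMetricSketch {
    // Offline partitions excluding topics that are being deleted.
    static long offlineCount(Map<String, List<Integer>> offlineByTopic,
                             Set<String> topicsBeingDeleted) {
        return offlineByTopic.entrySet().stream()
                .filter(e -> !topicsBeingDeleted.contains(e.getKey()))
                .mapToLong(e -> e.getValue().size())
                .sum();
    }

    // Partitions that are "offline" only because their topic is being deleted.
    static long pendingDeleteCount(Map<String, List<Integer>> offlineByTopic,
                                   Set<String> topicsBeingDeleted) {
        return offlineByTopic.entrySet().stream()
                .filter(e -> topicsBeingDeleted.contains(e.getKey()))
                .mapToLong(e -> e.getValue().size())
                .sum();
    }
}
```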





[jira] [Commented] (KAFKA-7994) Improve Stream-Time for rebalances and restarts

2019-05-14 Thread Richard Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839868#comment-16839868
 ] 

Richard Yu commented on KAFKA-7994:
---

Hi all, could we get some reviews on the PR? There hasn't been too much 
activity as of late, so I thought it would be good if we could get some 
thoughts.  

 

> Improve Stream-Time for rebalances and restarts
> ---
>
> Key: KAFKA-7994
> URL: https://issues.apache.org/jira/browse/KAFKA-7994
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Richard Yu
>Priority: Major
> Attachments: possible-patch.diff
>
>
> We compute a per-partition partition-time as the maximum timestamp over all 
> records processed so far. Furthermore, we use partition-time to compute 
> stream-time for each task as maximum over all partition-times (for all 
> corresponding task partitions). This stream-time is used to make decisions 
> about processing out-of-order records or drop them if they are late (ie, 
> timestamp < stream-time - grace-period).
> During rebalances and restarts, stream-time is initialized as UNKNOWN (ie, 
> -1) for tasks that are newly created (or migrated). In effect, we forget the 
> current stream-time in this case, which may lead to non-deterministic behavior 
> if we stop processing right before a late record that would be dropped if we 
> continued processing, but is not dropped after the rebalance/restart. Let's look 
> at an example with a grace period of 5ms for a tumbling window of 5ms, and 
> the following records (timestamps in parentheses):
>  
> {code:java}
> r1(0) r2(5) r3(11) r4(2){code}
> In the example, stream-time advances as 0, 5, 11, 11, and thus record `r4` is 
> dropped as late because 2 < 6 = 11 - 5. However, if we stop processing or 
> rebalance after processing `r3` but before processing `r4`, we would 
> reinitialize stream-time as -1 and thus would process `r4` on restart/after the 
> rebalance. The problem is that, from a global point of view, stream-time then 
> advances differently: 0, 5, 11, 2.
>  
> Note, this is a corner case: if we stopped processing one record 
> earlier, ie, after processing `r2` but before processing `r3`, stream-time 
> would advance correctly from a global point of view.
> A potential fix would be to store the latest observed partition-time in the 
> metadata of committed offsets. That way, on restart/rebalance we can 
> re-initialize time correctly.
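The example above can be traced with a small sketch of the lateness rule (a record is late when timestamp < stream-time - grace-period; -1 means UNKNOWN). This is an illustration of the described behavior, not Kafka Streams code:

```java
public class StreamTimeSketch {
    // Lateness rule as described: late iff timestamp < stream-time - grace,
    // where a stream-time of -1 means UNKNOWN (nothing can be late yet).
    static boolean isLate(long timestampMs, long streamTimeMs, long graceMs) {
        return streamTimeMs >= 0 && timestampMs < streamTimeMs - graceMs;
    }

    public static void main(String[] args) {
        long grace = 5L;
        long streamTime = -1L; // UNKNOWN, as after a restart/rebalance
        long[] timestamps = {0L, 5L, 11L, 2L}; // r1(0) r2(5) r3(11) r4(2)
        for (long ts : timestamps) {
            System.out.println("ts=" + ts + " late=" + isLate(ts, streamTime, grace));
            streamTime = Math.max(streamTime, ts);
        }
        // With uninterrupted processing, stream-time reaches 11 before r4,
        // so r4(2) is late (2 < 11 - 5). If stream-time were reset to -1 just
        // before r4, the same record would not be dropped.
    }
}
```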





[jira] [Commented] (KAFKA-6951) Implement offset expiration semantics for unsubscribed topics

2019-05-14 Thread Vahid Hashemian (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839753#comment-16839753
 ] 

Vahid Hashemian commented on KAFKA-6951:


Hi [~apovzner]. Thanks for catching this. I added a comment in the KIP to point 
this out, but I feel that's not enough.

Perhaps a follow up KIP for that remaining feature makes more sense?

> Implement offset expiration semantics for unsubscribed topics
> -
>
> Key: KAFKA-6951
> URL: https://issues.apache.org/jira/browse/KAFKA-6951
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Major
>
> [This 
> portion|https://cwiki.apache.org/confluence/display/KAFKA/KIP-211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets#KIP-211:ReviseExpirationSemanticsofConsumerGroupOffsets-UnsubscribingfromaTopic]
>  of KIP-211 will be implemented separately from the main PR.





[jira] [Resolved] (KAFKA-8363) Config provider parsing is broken

2019-05-14 Thread Randall Hauch (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch resolved KAFKA-8363.
--
Resolution: Fixed
  Reviewer: Randall Hauch

> Config provider parsing is broken
> -
>
> Key: KAFKA-8363
> URL: https://issues.apache.org/jira/browse/KAFKA-8363
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.0.1, 2.1.0, 2.2.0, 2.1.1
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Major
> Fix For: 2.0.2, 2.3.0, 2.1.2, 2.2.1
>
>
> The 
> [regex|https://github.com/apache/kafka/blob/63e4f67d9ba9e08bdce705b35c5acf32dcd20633/clients/src/main/java/org/apache/kafka/common/config/ConfigTransformer.java#L56]
>  used by the {{ConfigTransformer}} class to parse config provider syntax (see 
> [KIP-297|https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations])
>  is broken and fails when multiple path-less configs are specified. For 
> example: {{"${provider:configOne} ${provider:configTwo}"}} would be parsed 
> incorrectly as a reference with a path of {{"configOne} $\{provider"}}. 
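The failure mode above can be demonstrated with two regexes. These patterns are illustrative only, not Kafka's actual regex: a greedy ".*" can run past the first "}" into the next placeholder, while a character class that excludes '}' keeps each match inside a single "${provider:key}":

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyRegexDemo {
    // Buggy style: greedy groups can span across "}" into the next placeholder.
    static final Pattern GREEDY = Pattern.compile("\\$\\{(.*):(.*)\\}");
    // Safer style: exclude '}' (and ':' for the provider name) from the groups.
    static final Pattern BOUNDED = Pattern.compile("\\$\\{([^:}]+):([^}]+)\\}");

    public static void main(String[] args) {
        String input = "${provider:configOne} ${provider:configTwo}";

        Matcher g = GREEDY.matcher(input);
        if (g.find()) {
            // Greedy match swallows both placeholders into one bogus reference.
            System.out.println("greedy provider group: " + g.group(1));
        }

        Matcher b = BOUNDED.matcher(input);
        while (b.find()) {
            System.out.println(b.group(1) + " -> " + b.group(2));
        }
    }
}
```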





[jira] [Updated] (KAFKA-8363) Config provider parsing is broken

2019-05-14 Thread Randall Hauch (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8363:
-
Fix Version/s: 2.2.1
   2.1.2
   2.3.0
   2.0.2

> Config provider parsing is broken
> -
>
> Key: KAFKA-8363
> URL: https://issues.apache.org/jira/browse/KAFKA-8363
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.0.1, 2.1.0, 2.2.0, 2.1.1
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Major
> Fix For: 2.0.2, 2.3.0, 2.1.2, 2.2.1
>
>
> The 
> [regex|https://github.com/apache/kafka/blob/63e4f67d9ba9e08bdce705b35c5acf32dcd20633/clients/src/main/java/org/apache/kafka/common/config/ConfigTransformer.java#L56]
>  used by the {{ConfigTransformer}} class to parse config provider syntax (see 
> [KIP-297|https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations])
>  is broken and fails when multiple path-less configs are specified. For 
> example: {{"${provider:configOne} ${provider:configTwo}"}} would be parsed 
> incorrectly as a reference with a path of {{"configOne} $\{provider"}}. 





[jira] [Commented] (KAFKA-8365) Protocol and consumer support for follower fetching

2019-05-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839727#comment-16839727
 ] 

ASF GitHub Bot commented on KAFKA-8365:
---

mumrah commented on pull request #6731: KAFKA-8365 Consumer support for 
follower fetch
URL: https://github.com/apache/kafka/pull/6731
 
 
   Support for preferred read replica in consumer client
   
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 



> Protocol and consumer support for follower fetching
> ---
>
> Key: KAFKA-8365
> URL: https://issues.apache.org/jira/browse/KAFKA-8365
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Major
> Fix For: 2.3.0
>
>
> Add the consumer client changes and implement the protocol support for 
> [KIP-392|https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]
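The selection idea behind KIP-392 can be sketched roughly as follows. This is a hypothetical illustration of rack-aware replica choice; in the real protocol the broker returns a preferred read replica in the fetch response, and the consumer honors it:

```java
import java.util.Map;

public class ReplicaSelectionSketch {
    // Prefer a replica in the client's rack; otherwise fall back to the leader.
    // Input map: replica broker id -> rack id. All names are illustrative.
    static int preferredReadReplica(int leaderId, Map<Integer, String> replicaRacks,
                                    String clientRack) {
        return replicaRacks.entrySet().stream()
                .filter(e -> e.getValue().equals(clientRack))
                .map(Map.Entry::getKey)
                .findFirst()
                .orElse(leaderId);
    }
}
```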





[jira] [Commented] (KAFKA-8363) Config provider parsing is broken

2019-05-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839724#comment-16839724
 ] 

ASF GitHub Bot commented on KAFKA-8363:
---

rhauch commented on pull request #6726: KAFKA-8363: Fix parsing bug for config 
providers
URL: https://github.com/apache/kafka/pull/6726
 
 
   
 



> Config provider parsing is broken
> -
>
> Key: KAFKA-8363
> URL: https://issues.apache.org/jira/browse/KAFKA-8363
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.0.1, 2.1.0, 2.2.0, 2.1.1
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Major
>
> The 
> [regex|https://github.com/apache/kafka/blob/63e4f67d9ba9e08bdce705b35c5acf32dcd20633/clients/src/main/java/org/apache/kafka/common/config/ConfigTransformer.java#L56]
>  used by the {{ConfigTransformer}} class to parse config provider syntax (see 
> [KIP-297|https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations])
>  is broken and fails when multiple path-less configs are specified. For 
> example: {{"${provider:configOne} ${provider:configTwo}"}} would be parsed 
> incorrectly as a reference with a path of {{"configOne} $\{provider"}}. 





[jira] [Created] (KAFKA-8365) Protocol and consumer support for follower fetching

2019-05-14 Thread David Arthur (JIRA)
David Arthur created KAFKA-8365:
---

 Summary: Protocol and consumer support for follower fetching
 Key: KAFKA-8365
 URL: https://issues.apache.org/jira/browse/KAFKA-8365
 Project: Kafka
  Issue Type: New Feature
  Components: consumer
Reporter: David Arthur
Assignee: David Arthur
 Fix For: 2.3.0


Add the consumer client changes and implement the protocol support for 
[KIP-392|https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]





[jira] [Commented] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839711#comment-16839711
 ] 

Andrew commented on KAFKA-8315:
---

I constructed a super-ugly workaround for my {{join-example}} demo app. It 
adds a transformer onto each of the left and right streams, and they refer to 
each other's streamTime to decide whether to forward the messages. It's not the 
'right' solution, but I might be able to fix this up to serve as a workaround. 
I need to ensure that the right transformers pair up depending on their assigned 
partition, and maybe use a state store. It's a hack, but it might just work for 
now.

 

https://github.com/the4thamigo-uk/join-example/pull/1/files
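The gating idea in the workaround above can be sketched very roughly as below. This is a speculative illustration only: one side of the join forwards a record only once the other side's observed stream-time has caught up to the record's timestamp, so neither stream races ahead during a historical replay. The names and the allowed-lead parameter are hypothetical:

```java
public class StreamTimeGateSketch {
    // Forward a record only if the partner stream's observed stream-time has
    // advanced to within maxLeadMs of the record's timestamp.
    static boolean shouldForward(long recordTimestampMs, long otherSideStreamTimeMs,
                                 long maxLeadMs) {
        return recordTimestampMs <= otherSideStreamTimeMs + maxLeadMs;
    }
}
```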

> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated Kafka topics. This seems more apparent where one 
> topic has records at less frequent record-timestamp intervals than the other.
>  An example of the issue is provided in this repository:
> [https://github.com/the4thamigo-uk/join-example]
>  
> The only way to increase the period of historically joined records is to 
> increase the grace period for the join windows, and this has repercussions 
> when you extend it to a large period, e.g. 2 years of minute-by-minute records.
> Related Slack conversations: 
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]
>  
>  Research on this issue has gone through a few phases:
> 1) The issue was initially thought to be due to the inability to set the 
> retention period for a join window via {{Materialized}}, i.e.
> the documentation says to use `Materialized`, not `JoinWindows.until()` 
> ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
>  but there is nowhere to pass a `Materialized` instance to the join 
> operation; it seems only the group operation supports it.
> This was considered to be a problem with the documentation, not with the API, 
> and is addressed in [https://github.com/apache/kafka/pull/6664]
> 2) We then found an apparent issue in the code which would affect the 
> partition that is selected to deliver the next record to the join. This would 
> only be a problem for data that is out of order, and join-example uses data 
> that is in timestamp order in both topics, so this fix is thought not to 
> affect join-example.
> This was considered to be an issue and is being addressed in 
> [https://github.com/apache/kafka/pull/6719]
>  3) Further investigation using a crafted unit test seems to show that 
> partition selection and ordering (PartitionGroup/RecordQueue) work OK:
> [https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]
> 4) The current assumption is that the issue is rooted in the way records are 
> consumed from the topics:
> we have tried to set various options to suppress reads from the source topics, 
> but it doesn't seem to make any difference: 
> [https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]
>  





[jira] [Commented] (KAFKA-8354) Replace SyncGroup request/response with automated protocol

2019-05-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839707#comment-16839707
 ] 

ASF GitHub Bot commented on KAFKA-8354:
---

abbccdda commented on pull request #6729: KAFKA-8354: Replace Sync group 
request/response with automated protocol
URL: https://github.com/apache/kafka/pull/6729
 
 
   As part of https://issues.apache.org/jira/browse/KAFKA-7830
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 



> Replace SyncGroup request/response with automated protocol
> --
>
> Key: KAFKA-8354
> URL: https://issues.apache.org/jira/browse/KAFKA-8354
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>






[jira] [Commented] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839703#comment-16839703
 ] 

Andrew commented on KAFKA-8315:
---

I constructed a super-ugly workaround for my {{join-example}} demo app. It 
adds a transformer onto each of the left and right streams, and they refer to 
each other's streamTime to decide whether to forward the messages. It's not the 
'right' solution, but I might be able to fix this up to serve as a workaround. 
I need to ensure that the right transformers pair up depending on their assigned 
partition, and maybe use a state store. It's a hack, but it might just work for 
now.

 

https://github.com/the4thamigo-uk/join-example/pull/1/files

> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated Kafka topics. This seems more apparent where one 
> topic has records at less frequent record-timestamp intervals than the other.
>  An example of the issue is provided in this repository:
> [https://github.com/the4thamigo-uk/join-example]
>  
> The only way to increase the period of historically joined records is to 
> increase the grace period for the join windows, and this has repercussions 
> when you extend it to a large period, e.g. 2 years of minute-by-minute records.
> Related Slack conversations: 
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]
>  
>  Research on this issue has gone through a few phases:
> 1) The issue was initially thought to be due to the inability to set the 
> retention period for a join window via {{Materialized}}, i.e.
> the documentation says to use `Materialized`, not `JoinWindows.until()` 
> ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
>  but there is nowhere to pass a `Materialized` instance to the join 
> operation; it seems only the group operation supports it.
> This was considered to be a problem with the documentation, not with the API, 
> and is addressed in [https://github.com/apache/kafka/pull/6664]
> 2) We then found an apparent issue in the code which would affect the 
> partition that is selected to deliver the next record to the join. This would 
> only be a problem for data that is out of order, and join-example uses data 
> that is in timestamp order in both topics, so this fix is thought not to 
> affect join-example.
> This was considered to be an issue and is being addressed in 
> [https://github.com/apache/kafka/pull/6719]
>  3) Further investigation using a crafted unit test seems to show that 
> partition selection and ordering (PartitionGroup/RecordQueue) work OK:
> [https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]
> 4) The current assumption is that the issue is rooted in the way records are 
> consumed from the topics:
> we have tried to set various options to suppress reads from the source topics, 
> but it doesn't seem to make any difference: 
> [https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]
>  





[jira] [Commented] (KAFKA-8305) AdminClient should support creating topics with default partitions and replication factor

2019-05-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839617#comment-16839617
 ] 

ASF GitHub Bot commented on KAFKA-8305:
---

agavra commented on pull request #6728: KAFKA-8305: support default partitions 
& replication factor in AdminClient#createTopic
URL: https://github.com/apache/kafka/pull/6728
 
 
   See: 
[KIP-464](https://cwiki.apache.org/confluence/display/KAFKA/KIP-464%3A+Defaults+for+AdminClient%23createTopic)
 for more information.
   
   ### Description
   
   This change makes the two required changes to support creating topics using 
the cluster defaults for replication and partitions:
   
   1. Adds a `NewTopic(String)` constructor to the `NewTopic` API
   2. Changes the `AdminManager` to accept `-1` as a valid value for 
replication factor and partitions. When `-1` is given, it resolves the value 
using the cluster's default configuration.
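
   As a rough illustration of this resolution rule, here is a minimal sketch; 
the class and method names are hypothetical, not the actual `AdminManager` code:

```java
// Sketch of resolving the -1 sentinel against cluster defaults; the class and
// method names here are illustrative, not Kafka's actual AdminManager code.
class TopicDefaults {
    static final int USE_DEFAULT = -1;

    // requested: value from the CreateTopics request; clusterDefault: a broker
    // config such as num.partitions or default.replication.factor.
    static int resolve(int requested, int clusterDefault) {
        if (requested == USE_DEFAULT) {
            return clusterDefault; // fall back to the broker's configured default
        }
        if (requested <= 0) {
            throw new IllegalArgumentException("invalid value: " + requested);
        }
        return requested;
    }
}
```

   An explicit value always wins; only the `-1` sentinel triggers the fallback, 
and zero or negative values other than `-1` are rejected.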
   
   ### Testing
   
   - Updated unit tests with the new conditions
   - **TODO:** I still need to do end-to-end testing by spinning up my own 
Kafka cluster using this code and making sure that it works, but I'm putting 
the PR out early for review since the feature freeze is coming up soon
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> AdminClient should support creating topics with default partitions and 
> replication factor
> -
>
> Key: KAFKA-8305
> URL: https://issues.apache.org/jira/browse/KAFKA-8305
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin
>Reporter: Almog Gavra
>Priority: Major
>
> Today, the AdminClient creates topics by requiring a `NewTopic` object, which 
> must contain either partitions and replicas or an exact broker mapping (which 
> then infers partitions and replicas). Some users, however, could benefit from 
> just using the cluster default for replication factor but may not want to use 
> auto topic creation.
> NOTE: I am planning on working on this, but I do not have permissions to 
> assign this ticket to myself.





[jira] [Comment Edited] (KAFKA-6951) Implement offset expiration semantics for unsubscribed topics

2019-05-14 Thread Anna Povzner (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839605#comment-16839605
 ] 

Anna Povzner edited comment on KAFKA-6951 at 5/14/19 4:34 PM:
--

Hi [~vahid], the KIP-211 wiki implies that KIP-211 is fully implemented and 
included in the 2.1.0 release. Could you please add a reference to this JIRA to 
the KIP-211 wiki, so that it is clear which part of KIP-211 is implemented and 
which is not? 


was (Author: apovzner):
Hi [~vahid], the KIP-211 wiki implies that KIP-211 is fully implemented and 
included in the 2.1.1 release. Could you please add a reference to this JIRA to 
the KIP-211 wiki, so that it is clear which part of KIP-211 is implemented and 
which is not? 

> Implement offset expiration semantics for unsubscribed topics
> -
>
> Key: KAFKA-6951
> URL: https://issues.apache.org/jira/browse/KAFKA-6951
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Major
>
> [This 
> portion|https://cwiki.apache.org/confluence/display/KAFKA/KIP-211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets#KIP-211:ReviseExpirationSemanticsofConsumerGroupOffsets-UnsubscribingfromaTopic]
>  of KIP-211 will be implemented separately from the main PR.





[jira] [Commented] (KAFKA-6951) Implement offset expiration semantics for unsubscribed topics

2019-05-14 Thread Anna Povzner (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839605#comment-16839605
 ] 

Anna Povzner commented on KAFKA-6951:
-

Hi [~vahid], the KIP-211 wiki implies that KIP-211 is fully implemented and 
included in the 2.1.1 release. Could you please add a reference to this JIRA to 
the KIP-211 wiki, so that it is clear which part of KIP-211 is implemented and 
which is not? 

> Implement offset expiration semantics for unsubscribed topics
> -
>
> Key: KAFKA-6951
> URL: https://issues.apache.org/jira/browse/KAFKA-6951
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Major
>
> [This 
> portion|https://cwiki.apache.org/confluence/display/KAFKA/KIP-211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets#KIP-211:ReviseExpirationSemanticsofConsumerGroupOffsets-UnsubscribingfromaTopic]
>  of KIP-211 will be implemented separately from the main PR.





[jira] [Updated] (KAFKA-8343) streams application crashed due to rocksdb

2019-05-14 Thread gaoshu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaoshu updated KAFKA-8343:
--
Description: My streams application always crashes within a few days. The 
crash log looks like [https://github.com/facebook/rocksdb/issues/5234], so I 
think it may be because RocksDBStore.java is closed incorrectly when used from 
multiple threads, or RocksIterator.key() was accessed after RocksDBStore.close() 
in some cases.  (was: My streams application always crashes within a few days. 
The crash log looks like [https://github.com/facebook/rocksdb/issues/5234], so 
I think it may be because RocksDBStore.java is closed incorrectly when used 
from multiple threads, or RocksIterator.key() was accessed after 
RocksDBStore.close().)

> streams application crashed due to rocksdb
> --
>
> Key: KAFKA-8343
> URL: https://issues.apache.org/jira/browse/KAFKA-8343
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
> Environment: centos 7 jdk8 kafka-streams1.0
>Reporter: gaoshu
>Priority: Major
> Attachments: fullsizeoutput_6.jpeg
>
>
> My streams application always crashes within a few days. The crash log looks 
> like [https://github.com/facebook/rocksdb/issues/5234], so I think it may be 
> because RocksDBStore.java is closed incorrectly when used from multiple 
> threads, or RocksIterator.key() was accessed after RocksDBStore.close() in 
> some cases.





[jira] [Updated] (KAFKA-8343) streams application crashed due to rocksdb

2019-05-14 Thread gaoshu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaoshu updated KAFKA-8343:
--
Description: My streams application always crashes within a few days. The 
crash log looks like [https://github.com/facebook/rocksdb/issues/5234], so I 
think it may be because RocksDBStore.java is closed incorrectly when used from 
multiple threads, or RocksIterator.key() was accessed after 
RocksDBStore.close().  (was: My streams application always crashes within a 
few days. The crash log looks like 
[https://github.com/facebook/rocksdb/issues/5234], so I think it may be 
because RocksDBStore.java is closed incorrectly when used from multiple 
threads. Looking through the code below, db.close() should run after 
closeOpenIterators(). However, db.close() may be executed before the iterators 
are closed due to instruction reordering. I hope my guess is correct.
{code:java}
// RocksDBStore.java
@Override
public synchronized void close() {
    if (!open) {
        return;
    }

    open = false;
    closeOpenIterators();
    options.close();
    wOptions.close();
    fOptions.close();
    db.close();

    options = null;
    wOptions = null;
    fOptions = null;
    db = null;
}
{code})
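
The ordering concern described here can be made concrete with a self-contained 
sketch: a store that invalidates its open iterators before releasing the 
underlying resource, and an iterator that checks validity on every access. 
These are illustrative toy classes, not Kafka's or RocksDB's actual code:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not Kafka's actual classes): iterators must be
// invalidated BEFORE the backing resource is released (db.close()), and every
// iterator access must check validity, otherwise a thread can read from a
// store after another thread has closed it.
class ToyStore {
    private volatile boolean open = true;
    private final List<ToyIterator> iterators = new ArrayList<>();

    synchronized ToyIterator iterator() {
        if (!open) throw new IllegalStateException("store is closed");
        ToyIterator it = new ToyIterator();
        iterators.add(it);
        return it;
    }

    synchronized void close() {
        if (!open) return;
        open = false;
        // Invalidate iterators first, then the native handle would be freed.
        for (ToyIterator it : iterators) it.invalidate();
        iterators.clear();
    }

    class ToyIterator {
        private volatile boolean valid = true;

        void invalidate() { valid = false; }

        String key() {
            if (!valid) throw new IllegalStateException("iterator accessed after close");
            return "k";
        }
    }
}
```

With this guard, an iterator used after `close()` fails fast with a clear 
exception instead of touching a freed native resource.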

> streams application crashed due to rocksdb
> --
>
> Key: KAFKA-8343
> URL: https://issues.apache.org/jira/browse/KAFKA-8343
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
> Environment: centos 7 jdk8 kafka-streams1.0
>Reporter: gaoshu
>Priority: Major
> Attachments: fullsizeoutput_6.jpeg
>
>
> My streams application always crashes within a few days. The crash log looks 
> like [https://github.com/facebook/rocksdb/issues/5234], so I think it may be 
> because RocksDBStore.java is closed incorrectly when used from multiple 
> threads, or RocksIterator.key() was accessed after RocksDBStore.close().





[jira] [Assigned] (KAFKA-8364) Avoid decompression of record when validate record at server in the scene of inPlaceAssignment .

2019-05-14 Thread Flower.min (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flower.min reassigned KAFKA-8364:
-

Assignee: Flower.min

> Avoid decompression of record when validate record  at server in the scene of 
> inPlaceAssignment .
> -
>
> Key: KAFKA-8364
> URL: https://issues.apache.org/jira/browse/KAFKA-8364
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.2.0
>Reporter: Flower.min
>Assignee: Flower.min
>Priority: Major
>
> We did performance testing of the Kafka server in specific scenarios. We 
> built a Kafka cluster with one broker and created topics with different 
> numbers of partitions. We then started many producer processes to send large 
> amounts of messages to one of the topics in each test, and we found that the 
> upper limit of CPU usage was reached while the upper limit of the server's 
> network bandwidth was not (network inflow rate: 600 MB/s; CPU: >97%).
> We analyzed the JFR recordings of the Kafka server taken during the 
> performance testing. After we checked and repeated the performance test, we 
> located the code {{ByteBuffer recordBuffer = 
> ByteBuffer.allocate(sizeOfBodyInBytes);}} (class: DefaultRecord, function: 
> readFrom()), which consumed CPU resources and caused a lot of GC. After we 
> removed the allocation and copying of the ByteBuffer in our modified code, 
> the test performance improved greatly (network inflow rate: 1 GB/s; 
> CPU: <60%). This issue has already been raised and solved in *KAFKA-8106*.
> *We also analyzed the server-side record-validation code. Currently the 
> broker decompresses the whole record, including key and value, to validate 
> the record timestamp, key, offset, uncompressed size in bytes, and magic. We 
> removed the decompression operation and ran the performance test again, and 
> found that the CPU's stable usage was below 30% or even lower.* *Removing 
> the decompression of records can minimize CPU usage and improve performance 
> greatly.*
> Should we consider avoiding record decompression when validating records at 
> the server in the inPlaceAssignment case?
> *We think the server-side record-validation process should be optimized so 
> that messages can be verified without decompressing them.*
>  Maybe we can add some properties ('batch.min.timestamp' (Long), 
> 'records.number' (Integer), 'all.key.is.null' (boolean)) *on the client side 
> at the batch level for validation*, *so that we don't need to decompress 
> records to validate 'offset', 'timestamp', and key (the value of 
> 'all.key.is.null' will be false when any key is null).*





[jira] [Created] (KAFKA-8364) Avoid decompression of record when validate record at server in the scene of inPlaceAssignment .

2019-05-14 Thread Flower.min (JIRA)
Flower.min created KAFKA-8364:
-

 Summary: Avoid decompression of record when validate record  at 
server in the scene of inPlaceAssignment .
 Key: KAFKA-8364
 URL: https://issues.apache.org/jira/browse/KAFKA-8364
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 2.2.0
Reporter: Flower.min


We did performance testing of the Kafka server in specific scenarios. We built 
a Kafka cluster with one broker and created topics with different numbers of 
partitions. We then started many producer processes to send large amounts of 
messages to one of the topics in each test, and we found that the upper limit 
of CPU usage was reached while the upper limit of the server's network 
bandwidth was not (network inflow rate: 600 MB/s; CPU: >97%).

We analyzed the JFR recordings of the Kafka server taken during the 
performance testing. After we checked and repeated the performance test, we 
located the code {{ByteBuffer recordBuffer = 
ByteBuffer.allocate(sizeOfBodyInBytes);}} (class: DefaultRecord, function: 
readFrom()), which consumed CPU resources and caused a lot of GC. After we 
removed the allocation and copying of the ByteBuffer in our modified code, the 
test performance improved greatly (network inflow rate: 1 GB/s; CPU: <60%). 
This issue has already been raised and solved in *KAFKA-8106*.

*We also analyzed the server-side record-validation code. Currently the broker 
decompresses the whole record, including key and value, to validate the record 
timestamp, key, offset, uncompressed size in bytes, and magic. We removed the 
decompression operation and ran the performance test again, and found that the 
CPU's stable usage was below 30% or even lower.* *Removing the decompression 
of records can minimize CPU usage and improve performance greatly.*

Should we consider avoiding record decompression when validating records at 
the server in the inPlaceAssignment case?

*We think the server-side record-validation process should be optimized so 
that messages can be verified without decompressing them.*
 Maybe we can add some properties ('batch.min.timestamp' (Long), 
'records.number' (Integer), 'all.key.is.null' (boolean)) *on the client side 
at the batch level for validation*, *so that we don't need to decompress 
records to validate 'offset', 'timestamp', and key (the value of 
'all.key.is.null' will be false when any key is null).*
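
The batch-level metadata proposed here could be sketched as follows. The 
property names come from the reporter's proposal; the class, its fields, and 
the broker-side check are hypothetical illustrations, not Kafka code:

```java
// Hypothetical batch-level validation metadata, computed on the client before
// compression so the broker could validate a batch without decompressing it.
// Field names mirror the proposed properties 'batch.min.timestamp',
// 'records.number', and 'all.key.is.null'.
class BatchValidationInfo {
    final long batchMinTimestamp;  // smallest record timestamp in the batch
    final int recordsNumber;       // record count in the batch
    final boolean anyKeyIsNull;    // true when at least one record key is null

    BatchValidationInfo(long[] timestamps, Object[] keys) {
        long min = Long.MAX_VALUE;
        for (long ts : timestamps) min = Math.min(min, ts);
        this.batchMinTimestamp = min;
        this.recordsNumber = timestamps.length;
        boolean nullSeen = false;
        for (Object k : keys) if (k == null) nullSeen = true;
        this.anyKeyIsNull = nullSeen;
    }

    // Hypothetical broker-side check: validate timestamps against the batch
    // header alone, skipping decompression of the record payloads.
    boolean timestampsNotBefore(long allowedMinTimestamp) {
        return batchMinTimestamp >= allowedMinTimestamp;
    }
}
```

The trade-off is that the broker must trust (or spot-check) client-computed 
metadata, which is presumably why the proposal is limited to the 
inPlaceAssignment case.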





[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins 
over pre-populated Kafka topics. This seems more apparent where one topic has 
records at less frequent record-timestamp intervals than the other.
 An example of the issue is provided in this repository :

[https://github.com/the4thamigo-uk/join-example]

 
The only way to increase the period of historically joined records is to 
increase the grace period for the join windows, and this has repercussions when 
you extend it to a large period, e.g. 2 years of minute-by-minute records.
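
One way to see the grace-period trade-off: a pair of records can only join 
while their window has not yet been closed by stream time, so joining far-back 
historical data requires a grace period covering the whole backlog. A minimal 
sketch of that window-membership rule (an illustrative model, not the Kafka 
Streams implementation):

```java
// Illustrative model of stream-stream join window membership (not the actual
// Kafka Streams implementation). Two records can join if their timestamps are
// within the window size of each other, and the window is only still accepting
// records while stream time has not advanced past (windowEnd + grace).
class JoinWindowCheck {
    final long windowSizeMs;
    final long graceMs;

    JoinWindowCheck(long windowSizeMs, long graceMs) {
        this.windowSizeMs = windowSizeMs;
        this.graceMs = graceMs;
    }

    boolean canJoin(long leftTs, long rightTs, long streamTimeMs) {
        boolean withinWindow = Math.abs(leftTs - rightTs) <= windowSizeMs;
        long windowEnd = Math.max(leftTs, rightTs) + windowSizeMs;
        boolean windowStillOpen = streamTimeMs <= windowEnd + graceMs;
        return withinWindow && windowStillOpen;
    }
}
```

Under this model, replaying 2 years of history with stream time already near 
"now" drops every old pair unless the grace period spans those 2 years, which 
matches the repercussions described above.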



Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 
 Research on this issue has gone through a few phases :

1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized` not `JoinWindows.until()` 
([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
 but there is nowhere to pass a `Materialized` instance to the join operation; 
it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

 3) Further investigation using a crafted unit test seems to show that the 
partition-selection and ordering (PartitionGroup/RecordQueue) work correctly

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are 
consumed from the topics:

We have tried to set various options to suppress reads from the source topics 
but it doesn't seem to make any difference: 
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 

  was:
The problem we are experiencing is that we cannot reliably perform simple joins 
over pre-populated Kafka topics. This seems more apparent where one topic has 
records at less frequent record-timestamp intervals than the other.
 An example of the issue is provided in this repository :

[https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 
Research on this issue has gone through a few phases :



1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized` not `JoinWindows.until()` 
([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
 but there is nowhere to pass a `Materialized` instance to the join operation; 
it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

 3) Further investigation using a crafted unit test seems to show that the 
partition-selection and ordering (PartitionGroup/RecordQueue) work correctly

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are 
consumed from the topics:

We have tried to set various options to suppress reads from the source topics 
but it doesn't seem to make any difference: 
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  

[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins 
over pre-populated Kafka topics. This seems more apparent where one topic has 
records at less frequent record-timestamp intervals than the other.
 An example of the issue is provided in this repository :

[https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 
Research on this issue has gone through a few phases :



1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized` not `JoinWindows.until()` 
([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
 but there is nowhere to pass a `Materialized` instance to the join operation; 
it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

 3) Further investigation using a crafted unit test seems to show that the 
partition-selection and ordering (PartitionGroup/RecordQueue) work correctly

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are 
consumed from the topics:

We have tried to set various options to suppress reads from the source topics 
but it doesn't seem to make any difference: 
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 

  was:
The problem we are experiencing is that we cannot reliably perform simple joins 
over pre-populated Kafka topics. This seems more apparent where one topic has 
records at less frequent record-timestamp intervals than the other.
 An example of the issue is provided in this repository :

[https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 

1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized` not `JoinWindows.until()` 
([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
 but there is nowhere to pass a `Materialized` instance to the join operation; 
it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

 3) Further investigation using a crafted unit test seems to show that the 
partition-selection and ordering (PartitionGroup/RecordQueue) work correctly

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are 
consumed from the topics:

We have tried to set various options to suppress reads from the source topics 
but it doesn't seem to make any difference: 
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over 

[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins 
over pre-populated Kafka topics. This seems more of a problem where one topic 
has records at less frequent record-timestamp intervals than the other.
 An example of the issue is provided in this repository : 
 [https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 

1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized` not `JoinWindows.until()` 
([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
 but there is nowhere to pass a `Materialized` instance to the join operation; 
it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

 3) Further investigation using a crafted unit test seems to show that the 
partition-selection and ordering (PartitionGroup/RecordQueue) work correctly

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are 
consumed from the topics:

We have tried to set various options to suppress reads from the source topics 
but it doesn't seem to make any difference: 
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 

  was:
The problem we are experiencing is that we cannot reliably perform simple joins 
over pre-populated Kafka topics. This seems more of a problem where one topic 
has records at less frequent record-timestamp intervals than the other.
 An example of the issue is provided in this repository :

 [https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 

1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized` not `JoinWindows.until()` 
([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
 but there is nowhere to pass a `Materialized` instance to the join operation; 
it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

 3) Further investigation using a crafted unit test seems to show that the 
partition-selection and ordering (PartitionGroup/RecordQueue) work correctly

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are 
consumed from the topics:

We have tried to set various options to suppress reads from the source topics 
but it doesn't seem to make any difference: 
https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63

 


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more of a 

[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins 
over pre-populated Kafka topics. This seems more of a problem where one topic 
has records at less frequent record-timestamp intervals than the other.
 An example of the issue is provided in this repository :


 [https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 

1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized` not `JoinWindows.until()` 
([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
 but there is nowhere to pass a `Materialized` instance to the join operation; 
it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

 3) Further investigation using a crafted unit test seems to show that the 
partition-selection and ordering (PartitionGroup/RecordQueue) work correctly

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are 
consumed from the topics:

We have tried to set various options to suppress reads from the source topics 
but it doesn't seem to make any difference: 
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 

  was:
The problem we are experiencing is that we cannot reliably perform simple joins 
over pre-populated Kafka topics. This seems more of a problem where one topic 
has records at less frequent record-timestamp intervals than the other.
 An example of the issue is provided in this repository : 
 [https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 

1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized` not `JoinWindows.until()` 
([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
 but there is nowhere to pass a `Materialized` instance to the join operation; 
it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

3) Further investigation using a crafted unit test suggests that the partition selection and ordering (PartitionGroup/RecordQueue) work OK:

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are consumed from the topics:

We have tried to set various options to suppress reads from the source topics, but it doesn't seem to make any difference:
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more of a 

[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more apparent where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

[https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]
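The retention/grace distinction at issue here can be illustrated with a small model (plain Python; the helper names and numbers are invented, and this is not the Streams API): grace bounds how late a record may arrive and still join, while retention bounds how long window state is kept, so the two should be settable independently rather than coupled as with the deprecated `until()`.

```python
def window_state_alive(record_ts, stream_time, window_size, retention):
    """Window state for a record is kept while stream time has not advanced
    past the window end plus the retention period (toy model)."""
    return stream_time < record_ts + window_size + retention

def joinable(record_ts, late_ts, window_size, grace, stream_time):
    """A late record joins only if it falls inside the join window AND
    arrives within the grace period (toy model, invented helper)."""
    in_window = abs(late_ts - record_ts) <= window_size
    within_grace = stream_time - late_ts <= grace
    return in_window and within_grace

# Independent settings: a short grace period with a much longer retention.
window, grace, retention = 10, 5, 60

# A record 3 time units late still joins (inside window and grace)...
assert joinable(record_ts=100, late_ts=103, window_size=window,
                grace=grace, stream_time=106)
# ...while the window's state can outlive the grace period.
assert window_state_alive(record_ts=100, stream_time=150,
                          window_size=window, retention=retention)
print("ok")
```

Under `JoinWindows.until()` the retention effectively also acted as the lateness bound, which is why being unable to pass `Materialized` (and hence set retention separately) to the join was reported as a problem.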

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

3) Further investigation using a crafted unit test suggests that the partition selection and ordering (PartitionGroup/RecordQueue) work OK:

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are consumed from the topics:

We have tried to set various options to suppress reads from the source topics, but it doesn't seem to make any difference:
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 

  was:
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:


 [https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

3) Further investigation using a crafted unit test suggests that the partition selection and ordering (PartitionGroup/RecordQueue) work OK:

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are consumed from the topics:

We have tried to set various options to suppress reads from the source topics, but it doesn't seem to make any difference:
[https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63]

 


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more apparent where 

[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

[https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900]

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

3) Further investigation using a crafted unit test suggests that the partition selection and ordering (PartitionGroup/RecordQueue) work OK:

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are consumed from the topics:

We have tried to set various options to suppress reads from the source topics, but it doesn't seem to make any difference:
https://github.com/the4thamigo-uk/join-example/commit/c674977fd0fdc689152695065d9277abea6bef63

 

  was:
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

[https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

 [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

3) Further investigation using a crafted unit test suggests that the partition selection and ordering (PartitionGroup/RecordQueue) work OK:

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are consumed from the topics.



 


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more of a problem where one 
> topic has records at less frequent record timestamp intervals than the other.
> An example of the issue is provided in this repository:
> [https://github.com/the4thamigo-uk/join-example]
>  
> 

[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

[https://github.com/the4thamigo-uk/join-example]

 

Related slack conversations : 

 [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
[https://github.com/apache/kafka/pull/6719]

3) Further investigation using a crafted unit test suggests that the partition selection and ordering (PartitionGroup/RecordQueue) work OK:

[https://github.com/the4thamigo-uk/kafka/commit/5121851491f2fd0471d8f3c49940175e23a26f1b]

4) The current assumption is that the issue is rooted in the way records are consumed from the topics.



 

  was:
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

[https://github.com/the4thamigo-uk/join-example]

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
https://github.com/apache/kafka/pull/6719

 3) Further investigation using a couple of crafted unit tests 

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more of a problem where one 
> topic has records at less frequent record timestamp intervals that the other.
> An example of the issue is provided in this repository:
> [https://github.com/the4thamigo-uk/join-example]
>  
> Related slack conversations : 
>  [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1557733979453900
>  
> 1) This issue was initially thought to be due to the inability to set the 
> retention period for a join window via {{Materialized}}, i.e.
> The documentation says to use `Materialized` not `JoinWindows.until()` 
> 

[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

[https://github.com/the4thamigo-uk/join-example]

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. This would only be a 
problem for data that is out-of-order, and join-example uses data that is in 
order of timestamp in both topics. So this fix is thought not to affect 
join-example.

This was considered to be an issue and is being addressed in 
https://github.com/apache/kafka/pull/6719

 3) Further investigation using a couple of crafted unit tests 

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)

  was:
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

https://github.com/the4thamigo-uk/join-example

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. 

This was considered to be an issue and is being addressed in 

 

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more of a problem where one 
> topic has records at less frequent record timestamp intervals than the other.
>  An example of the issue is provided in this repository :
> [https://github.com/the4thamigo-uk/join-example]
>  
> 1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:
> The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.
> This was considered to be a problem with the documentation not with the API 
> and is addressed in [https://github.com/apache/kafka/pull/6664]
> 2) We then found an apparent issue in the code which would affect the 
> partition that is selected to deliver the next record to the join. This would 
> only be a problem for data that is out-of-order, and join-example uses data 
> that is in order of timestamp in both topics. So this fix is thought not to 
> affect 

[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

https://github.com/the4thamigo-uk/join-example

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2) We then found an apparent issue in the code which would affect the partition 
that is selected to deliver the next record to the join. 

This was considered to be an issue and is being addressed in 

 

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)

  was:
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2)

 

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more of a problem where one 
> topic has records at less frequent record timestamp intervals than the other.
>  An example of the issue is provided in this repository :
> https://github.com/the4thamigo-uk/join-example
>  
> 1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:
> The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.
> This was considered to be a problem with the documentation not with the API 
> and is addressed in [https://github.com/apache/kafka/pull/6664]
> 2) We then found an apparent issue in the code which would affect the 
> partition that is selected to deliver the next record to the join. 
> This was considered to be an issue and is being addressed in 
>  
>  
> Slack conversation here : 
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> [Additional]
> From what I understand, the retention period should be independent of the 
> grace period, so I think this is more than a documentation fix (see comments 
> below)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.
An example of the issue is provided in this repository:

 

1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

This was considered to be a problem with the documentation not with the API and 
is addressed in [https://github.com/apache/kafka/pull/6664]

2)

 

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)

  was:
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.

1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more of a problem where one 
> topic has records at less frequent record timestamp intervals than the other.
> An example of the issue is provided in this repository : 
>  
> 1) This issue was initially thought to be due to the inability to set the retention period for a join window via {{Materialized}}, i.e.:
> The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.
> This was considered to be a problem with the documentation not with the API 
> and is addressed in [https://github.com/apache/kafka/pull/6664]
> 2)
>  
>  
> Slack conversation here : 
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> [Additional]
> From what I understand, the retention period should be independent of the 
> grace period, so I think this is more than a documentation fix (see comments 
> below)





[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Description: 
The problem we are experiencing is that we cannot reliably perform simple joins over pre-populated Kafka topics. This seems more of a problem where one topic has records at less frequent record timestamp intervals than the other.

1) This issue was initially thought to be due to the inability to set the 
retention period for a join window via {{Materialized}}

The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)

  was:
The documentation says to use `Materialized`, not `JoinWindows.until()` ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]), but there is nowhere to pass a `Materialized` instance to the join operation; it seems only the group operation supports it.

 

Slack conversation here : 
[https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]

[Additional]

From what I understand, the retention period should be independent of the grace period, so I think this is more than a documentation fix (see comments below)


> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The problem we are experiencing is that we cannot reliably perform simple 
> joins over pre-populated kafka topics. This seems more of a problem where one 
> topic has records at less frequent record timestamp intervals than the other.
> 1) This issue was initially thought to be due to the inability to set the 
> retention period for a join window via {{Materialized}}
> The documentation says to use `Materialized` not `JoinWindows.until()` 
> ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
>  but there is nowhere to pass a `Materialized` instance to the join 
> operation; it seems only the group operation supports it.
>  
> Slack conversation here : 
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> [Additional]
> From what I understand, the retention period should be independent of the 
> grace period, so I think this is more than a documentation fix (see comments 
> below)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8315) Historical join issues

2019-05-14 Thread Andrew (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew updated KAFKA-8315:
--
Summary: Historical join issues  (was: Cannot pass Materialized into a join 
operation - hence cant set retention period independent of grace)

> Historical join issues
> --
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The documentation says to use `Materialized` not `JoinWindows.until()` 
> ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
>  but there is nowhere to pass a `Materialized` instance to the join 
> operation; it seems only the group operation supports it.
>  
> Slack conversation here : 
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> [Additional]
> From what I understand, the retention period should be independent of the 
> grace period, so I think this is more than a documentation fix (see comments 
> below)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)