[GitHub] [kafka] big-andy-coates commented on pull request #9156: KAFKA-10077: Filter downstream of state-store results in spurious tombstones

2020-09-22 Thread GitBox


big-andy-coates commented on pull request #9156:
URL: https://github.com/apache/kafka/pull/9156#issuecomment-696383217







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] big-andy-coates commented on pull request #9156: KAFKA-10077: Filter downstream of state-store results in spurious tombstones

2020-09-22 Thread GitBox


big-andy-coates commented on pull request #9156:
URL: https://github.com/apache/kafka/pull/9156#issuecomment-696565610


   test this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] big-andy-coates commented on pull request #9156: KAFKA-10077: Filter downstream of state-store results in spurious tombstones

2020-09-21 Thread GitBox


big-andy-coates commented on pull request #9156:
URL: https://github.com/apache/kafka/pull/9156#issuecomment-696383217


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] big-andy-coates commented on pull request #9156: KAFKA-10077: Filter downstream of state-store results in spurious tombstones

2020-09-21 Thread GitBox


big-andy-coates commented on pull request #9156:
URL: https://github.com/apache/kafka/pull/9156#issuecomment-696383217


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] big-andy-coates commented on pull request #9156: KAFKA-10077: Filter downstream of state-store results in spurious tombstones

2020-09-17 Thread GitBox


big-andy-coates commented on pull request #9156:
URL: https://github.com/apache/kafka/pull/9156#issuecomment-694203001


   @mjsax - ready for another review.
   
   - boolean `onlyIfMaterialized` is now flipped to `forceMaterialization`.
   - `KTableFilter` constructor initialized internal `sendOldValues` flag based 
on parent tables state. (Not perfect, but works).
   - Removed test that was testing bad behaviour.
   
   The only outstanding piece, as I see it, is the bad behaviour if you call 
`enableSendingOldValues` without forcing materialization on a table that is 
itself materialized, but who's upstream is not.  In such a situation, the table 
will _not_ enable sending old values. 
   
   A similar bug existed before this PR:  if you called 
`enableSendingOldValues` on a table that is itself materialized, but who's 
upstream is not, then it will force materialization on its upstream table, 
which is unnecessary.
   
   With the new code, both of the bad behaviours are true, i.e. the change 
introduces the first one and the second one is still happening.  
   
   I've raised https://issues.apache.org/jira/browse/KAFKA-10494 to track this.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] big-andy-coates commented on pull request #9156: KAFKA-10077: Filter downstream of state-store results in spurious tombstones

2020-09-02 Thread GitBox


big-andy-coates commented on pull request #9156:
URL: https://github.com/apache/kafka/pull/9156#issuecomment-685493474


   @guozhangwang @mjsax ready for another review.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] big-andy-coates commented on pull request #9156: KAFKA-10077: Filter downstream of state-store results in spurious tombstones

2020-08-10 Thread GitBox


big-andy-coates commented on pull request #9156:
URL: https://github.com/apache/kafka/pull/9156#issuecomment-671501927


   @mjsax as discussed. Please review.
   
   I think this change makes the output from the table filter semantically 
correct, i.e. we no longer output tombstones for rows that didn't exist in the 
output to begin with. However, this comes at a cost!  Now the source table is 
being materialized, (as you can see from the changes needed to get some other 
tests to pass).
   
   The cost of better semantics could be very high, and the user has no way of 
avoiding this. Where as previously the user could choose to 'fix' the bad 
semantics by manually calling `enableSendingOldValues` themselves, if they 
cared. I'm left feeling a little uneasy about _forcing_ users to pay the cost 
of materialization, even if they either don't care about the spurious 
tombstones, or their use case doesn't generate them.
   
   This leads me to the following questions:
   
   1. Wouldn't this be a breaking change for existing users of the library? If 
we stick with this solution, how would we handle this?
   1. Might it be better to only enable the sending of old values _if_ the 
source table is already materialized? This would mean the fix only pays the 
cost of an additional rocksdb read, which is still not zero, but much lower 
that forced materialization, and it would also mean this isn't a breaking 
change.
   
   Or maybe we choose to _not_ fix this. Preferring the current semantically 
incorrect, but better performing, solution with a known workaround for users 
that require correct semantics?  i.e. we could document the use of 
`enableSendingOldValues` in the `filter` method's java docs.
   
   Your thoughts my good man?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org