Damian,

Thanks for the proposal. I have a few comments on the APIs:

1. Printed#withFile does not seem necessary, since users should always
specify up front whether the output goes to sysOut or to a file. On second
thought, I also think serdes are not useful for printing anyway: we assume
`toString` is provided, except for byte arrays, which we will handle as a
special case.
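
A minimal sketch of what "rely on `toString`, special-case byte arrays"
could look like. The PrintFormat class, its method names, and the
"[label]: key, value" layout are assumptions made for illustration only;
they are not taken from the KIP.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the KIP: renders keys and values for
// printing without needing a serde.
final class PrintFormat {
    private PrintFormat() {}

    // byte[] has no useful toString(), so it is rendered element by element;
    // every other object (including null) goes through String.valueOf.
    static String render(final Object o) {
        if (o instanceof byte[]) {
            return Arrays.toString((byte[]) o);
        }
        return String.valueOf(o);
    }

    // Assumed output layout: "[label]: key, value".
    static String line(final String label, final Object key, final Object value) {
        return "[" + label + "]: " + render(key) + ", " + render(value);
    }
}
```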

Another comment about Printed in general: it differs from the other option
classes in that it is required rather than optional, since it carries the
toSysOut / toFile specs. What are the pros and cons of including these two
in the option class, which makes it required, versus leaving them at the
API layer and making Printed optional, for mapper / label only?


2.1 KStream#through / to

Should we have an overload without Produced?

2.2 KStream#groupBy / groupByKey

Should we have an overload without Serialized?

2.3 KGroupedStream#count / reduce / aggregate

Should we have an overload without Materialized?

2.4 KStream#join

Should we have an overload without Joined?


2.5 Each of KTable's operators:

Should we have an overload without Produced / Serialized /
Materialized?
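
The overload pattern asked about in 2.1 to 2.5 can be sketched as below:
the variant without the options argument simply delegates to the variant
with it, passing defaults. SinkOptions and StreamSketch are hypothetical
stand-ins (plain strings replace real Serde instances); this is not the
actual KStream API.

```java
// Hypothetical stand-in for an options class such as Produced.
final class SinkOptions {
    final String keySerde;   // serde names stand in for real Serde instances
    final String valueSerde;

    SinkOptions(final String keySerde, final String valueSerde) {
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
    }

    // Assumed meaning of "no options given": fall back to configured defaults.
    static SinkOptions defaults() {
        return new SinkOptions("default", "default");
    }
}

// Hypothetical stand-in for KStream, showing only the overload shape.
final class StreamSketch {
    // Overload without options: delegates with defaults.
    String to(final String topic) {
        return to(topic, SinkOptions.defaults());
    }

    // Overload with options: the single place where the work happens.
    String to(final String topic, final SinkOptions options) {
        return topic + "[key=" + options.keySerde + ", value=" + options.valueSerde + "]";
    }
}
```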



3.1 Produced: the static functions overlap, which seems unnecessary. I'd
suggest just having the following three static functions, plus three
matching member functions:

public static <K, V> Produced<K, V> withKeySerde(final Serde<K> keySerde)

public static <K, V> Produced<K, V> withValueSerde(final Serde<V>
valueSerde)

public static <K, V> Produced<K, V> withStreamPartitioner(final
StreamPartitioner<K, V> partitioner)

The key idea is that by using the same function names for the static
constructors and the member functions, users do not need to remember the
differences between them and can call these functions in any order they
want; a later call on the same spec overrides an earlier one.
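
A minimal sketch of the "any order, later call wins" behavior, using a
hypothetical OptionsSketch class rather than the real Produced, with plain
strings standing in for Serde instances. One caveat: Java rejects a static
and an instance method in the same class when they share a name and the
same erased parameter types, so the static entry points here are named
keySerde / valueSerde. That naming is an assumption of this sketch, not
part of the suggestion above.

```java
// Hypothetical stand-in for Produced; strings replace real Serde instances.
final class OptionsSketch {
    private final String keySerde;
    private final String valueSerde;

    private OptionsSketch(final String keySerde, final String valueSerde) {
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
    }

    // Static entry points: each sets one spec and leaves the other unset.
    static OptionsSketch keySerde(final String keySerde) {
        return new OptionsSketch(keySerde, null);
    }

    static OptionsSketch valueSerde(final String valueSerde) {
        return new OptionsSketch(null, valueSerde);
    }

    // Member functions: return a copy with one spec replaced, so a later
    // call on the same spec overrides an earlier one.
    OptionsSketch withKeySerde(final String keySerde) {
        return new OptionsSketch(keySerde, this.valueSerde);
    }

    OptionsSketch withValueSerde(final String valueSerde) {
        return new OptionsSketch(this.keySerde, valueSerde);
    }

    String describe() {
        return "key=" + keySerde + ", value=" + valueSerde;
    }
}
```

With this shape, OptionsSketch.keySerde("a").withValueSerde("b") and
OptionsSketch.valueSerde("b").withKeySerde("a") describe the same options.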


3.2 Serialized: similarly

public static <K, V> Serialized<K, V> withKeySerde(final Serde<K> keySerde)

public static <K, V> Serialized<K, V> withValueSerde(final Serde<V>
valueSerde)

public Serialized<K, V> withKeySerde(final Serde<K> keySerde)

public Serialized<K, V> withValueSerde(final Serde<V> valueSerde)

Also, one of its static constructors takes a final Serde<V>
otherValueSerde; is that intentional?

3.3 Joined: similarly, keep the static constructor signatures the same as
those of its corresponding member functions.

3.4 Materialized: it is a bit special, and I think we can keep only its two
static `as` constructors, as they are today.


4. Are there any modifications to StateStoreSupplier? Is it replaced by
BytesStoreSupplier? Some more description seems to be missing here. Also, in

public static <K, V, S extends StateStore> Materialized<K, V, S>
as(final StateStoreSupplier<S> supplier)

should the parameter be of type BytesStoreSupplier?




Guozhang


On Thu, Jul 27, 2017 at 5:26 AM, Damian Guy <damian....@gmail.com> wrote:

> Updated link:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines
>
> Thanks,
> Damian
>
> On Thu, 27 Jul 2017 at 13:09 Damian Guy <damian....@gmail.com> wrote:
>
> > Hi,
> >
> > I've put together a KIP to make some changes to the KafkaStreams DSL that
> > will hopefully allow us to:
> > 1) reduce the explosion of overloads
> > 2) add new features without having to continue adding more overloads
> > 3) provide simpler ways for people to use custom storage engines and wrap
> > them with logging, caching etc if desired
> > 4) enable per-operator caching rather than global caching without having
> > to resort to supplying a StateStoreSupplier when you just want to turn
> > caching off.
> >
> > The KIP is here:
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73631309
> >
> > Thanks,
> > Damian
> >
>



-- 
-- Guozhang
