You can use a split operator, generating 2 streams.

Darshan Singh <darshan.m...@gmail.com> 于 2018年3月30日周五 上午2:53写道:

> Hi
>
> I have a dataset which has almost 99% of correct data. As of now if say
> some data is bad I just ignore it and log it and return only correct data.
> I do this inside a map function.
>
> The part which decides whether data is correct or not is expensive one.
>
> Now I want to store the bad data somewhere so that I could analyze that
> data in future.
>
> So I can run the same calc 2 times and get the correct data in first go
> and bad data in 2nd go.
>
> Is there a better way where I can somehow store the bad data from inside
> of map function like send to kafka, file etc?
>
> Also, is there a way I could create a datastream which can get the data
> from inside map function(not sure this is feasible as of now)?
>
> Thanks
>

Reply via email to