You can use a split operator, which generates two streams: one for the correct records and one for the bad ones.

On Fri, Mar 30, 2018 at 2:53 AM, Darshan Singh <darshan.m...@gmail.com> wrote:
> Hi,
>
> I have a dataset in which almost 99% of the data is correct. As of now, if
> some data is bad, I just ignore it, log it, and return only the correct data.
> I do this inside a map function.
>
> The part that decides whether data is correct or not is expensive.
>
> Now I want to store the bad data somewhere so that I can analyze it
> in the future.
>
> I could run the same calculation twice, getting the correct data on the
> first pass and the bad data on the second.
>
> Is there a better way to store the bad data from inside the map
> function, such as sending it to Kafka, a file, etc.?
>
> Also, is there a way to create a datastream that gets its data
> from inside the map function (I am not sure this is feasible as of now)?
>
> Thanks
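
To make the split idea concrete, here is a rough sketch of the pattern: run the expensive check once per record, tag the result, and route it to one of two outputs. This is plain Java, with in-memory lists standing in for the two streams; it does not use any streaming library API. The names SplitSketch, Tagged, validate, and split are hypothetical, and the integer parse is only a placeholder for the real (expensive) check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitSketch {

    /** Result of the check for one record: the raw input plus the parsed value, if any. */
    static final class Tagged {
        final boolean valid;
        final String raw;
        final Integer value; // null when the record is invalid

        Tagged(boolean valid, String raw, Integer value) {
            this.valid = valid;
            this.raw = raw;
            this.value = value;
        }
    }

    /** The two outputs, standing in for the two streams produced by a split. */
    static final class Result {
        final List<Integer> good = new ArrayList<>();
        final List<String> bad = new ArrayList<>();
    }

    // Hypothetical placeholder for the expensive validation: a record is
    // "correct" here if it parses as an integer.
    static Tagged validate(String raw) {
        try {
            return new Tagged(true, raw, Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return new Tagged(false, raw, null);
        }
    }

    // Runs the check exactly once per record and routes on the tag.
    static Result split(List<String> records) {
        Result out = new Result();
        for (String raw : records) {
            Tagged t = validate(raw);
            if (t.valid) {
                out.good.add(t.value);
            } else {
                out.bad.add(t.raw); // keep the raw record for later analysis
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Result r = split(Arrays.asList("1", "oops", "3"));
        System.out.println("good=" + r.good + " bad=" + r.bad);
    }
}
```

In a real job, the map function would emit the tagged record and the split operator would do the routing that the if/else does here. The bad stream can then be written to Kafka or a file by its own sink, without running the calculation a second time.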