Hello,

We recently introduced a style guide for developing PTransforms (https://beam.apache.org/contribute/ptransform-style-guide/), and a natural consequence was bringing Beam itself into compliance with its own API best practices.
That work is tracked in JIRA at https://issues.apache.org/jira/browse/BEAM-1353. Many of the changes are (trivially) backward-incompatible, so we're trying to finish this work soon, before declaring the Beam API stable.

The changes so far are:

- TextIO.Read now always returns a PCollection<String> and does *not* take .withCoder() to parse the strings. Instead, parse the strings by applying a ParDo or MapElements to the collection.
- Likewise, TextIO.Write now always takes a PCollection<String>. To write something else with TextIO, convert it to String using a ParDo or MapElements.
- Class Write.Bound is now simply Write. This matters only if you were extracting applications of Write.to(Sink) into a variable: its type used to be Write.Bound<...>, and is now Write<...>.
- Likewise, classes Flatten.FlattenIterables and Flatten.FlattenPCollectionList are renamed to Flatten.Iterables and Flatten.PCollections, respectively.
- GroupByKey.create(boolean fewKeys) is replaced by GroupByKey.create() and GroupByKey.createWithFewKeys().
- Classes Count.PerElement, PerKey and Globally are now private, so you have to use factory functions such as Count.perElement() (previously you could use "new Count.PerElement()"). Additionally, if you want to use e.g. .withHotKeyFanout(), you can no longer do that directly on the result of .apply(Count.perElement()) etc. Instead, Count exposes its combine function as Count.combineFn(), and you should apply Combine.globally(Count.combineFn()) yourself.
- The same as for Count applies to Latest and Sample.
- ToString.of() is renamed to ToString.elements(), kv to kvs, and iterable to iterables.
- BufferedExternalSorter.Options setter methods are renamed from setBlah to withBlah.
- In KafkaIO, you now have to specify type parameters explicitly, e.g. KafkaIO.<Foo, Bar>read().

More changes like this will follow. Please follow the JIRA item for updates.
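
For anyone migrating, here is a rough sketch of what the TextIO and Count changes above might look like in user code. It assumes the Beam Java SDK is on the classpath and that each input line holds a single integer. The class and method names (StyleGuideMigrationSketch, parseLine, parseInts, countAll) are made up for illustration and are not part of Beam. Treat it as a starting point rather than the canonical way to do this.

```java
import org.apache.beam.sdk.transforms.Combine;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.SimpleFunction;
import org.apache.beam.sdk.values.PCollection;

/** Hypothetical helper class; the names here are illustrative, not part of Beam. */
public class StyleGuideMigrationSketch {

  /** Plain parsing logic, kept separate so it can be checked without a pipeline. */
  static Integer parseLine(String line) {
    // Assumption: every line is a single integer, possibly padded with whitespace.
    return Integer.parseInt(line.trim());
  }

  /**
   * Stands in for the old .withCoder() usage: take the PCollection<String>
   * that TextIO produces and parse it in an explicit MapElements step.
   */
  static PCollection<Integer> parseInts(PCollection<String> lines) {
    return lines.apply(
        MapElements.via(
            new SimpleFunction<String, Integer>() {
              @Override
              public Integer apply(String line) {
                return parseLine(line);
              }
            }));
  }

  /**
   * Applies Count's combine function through Combine directly, so that the
   * options on the Combine transform stay reachable.
   */
  static PCollection<Long> countAll(PCollection<String> lines) {
    return lines.apply(Combine.globally(Count.<String>combineFn()));
  }
}
```

Writing goes the other way round: map your elements to String first, then hand the resulting PCollection<String> to TextIO.Write.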
