Hi Eugene,thanks for the update. I'm volunteer to tackle some those IOs (and make them conform with PTransform style guide). I'm pretty sure other people will jump on ;)
Regards JB On 04/08/2017 12:20 AM, Eugene Kirpichov wrote:
Hey all, More progress has been made and we're nearing completion. ParDo, BigQueryIO and Window are fixed; Map/FlatMapElements are in review. The remaining unclaimed ones are all IOs of some form, and here's a list. I've marked them all as "starter" in JIRA. XML - https://issues.apache.org/jira/browse/BEAM-1914 TFRecordIO (Tensorflow) - https://issues.apache.org/jira/browse/BEAM-1913 KinesisIO - https://issues.apache.org/jira/browse/BEAM-1428 PubsubIO - https://issues.apache.org/jira/browse/BEAM-1415 CountingInput - https://issues.apache.org/jira/browse/BEAM-1414 https://github.com/apache/beam/pull/2149 , which fixes BigQueryIO, is a good model to follow when taking these on, as well as e.g. https://github.com/apache/beam/pull/1927 (TextIO) These are all actually easy to fix, but need volunteers (I do not have time to fix all of these myself, but happy to be a reviewer - @jkff). Let's finish this up in time for the first Beam stable release, so Beam's stable API surface is consistent and polished! On Fri, Mar 3, 2017 at 12:00 PM Eugene Kirpichov <kirpic...@google.com> wrote:It sounds like a great idea in general, though there are actually not that many issues remaining - I fixed a bunch, and have PRs out for two more (ParDo and BigQueryIO). The remaining ones are: [image: Bug - A problem which impairs or prevents the functions of the product.] BEAM-1355 <https://issues.apache.org/jira/browse/BEAM-1355> HDFS IO should comply with PTransform style guide - [image: Major - Major loss of function.] - OPEN [image: Bug - A problem which impairs or prevents the functions of the product.] BEAM-1402 <https://issues.apache.org/jira/browse/BEAM-1402> Make TextIO and AvroIO use best-practice types. - [image: Major - Major loss of function.] - OPEN [image: Bug - A problem which impairs or prevents the functions of the product.] BEAM-1414 <https://issues.apache.org/jira/browse/BEAM-1414> CountingInput should comply with PTransform style guide - [image: Major - Major loss of function.] - OPEN [image: Bug - A problem which impairs or prevents the functions of the product.] BEAM-1415 <https://issues.apache.org/jira/browse/BEAM-1415> PubsubIO should comply with PTransfrom style guide - [image: Major - Major loss of function.] - OPEN [image: Bug - A problem which impairs or prevents the functions of the product.] BEAM-1418 <https://issues.apache.org/jira/browse/BEAM-1418> MapElements and FlatMapElements should comply with PTransform style guide - [image: Major - Major loss of function.] - OPEN [image: Bug - A problem which impairs or prevents the functions of the product.] BEAM-1425 <https://issues.apache.org/jira/browse/BEAM-1425> Window should comply with PTransform style guide - [image: Major - Major loss of function.] - OPEN [image: Bug - A problem which impairs or prevents the functions of the product.] BEAM-1428 <https://issues.apache.org/jira/browse/BEAM-1428> KinesisIO should comply with PTransform style guide - [image: Major - Major loss of function.] - OPEN HDFS is pending the other mailing list thread. For TextIO/AvroIO there's a pending PR by Reuven. Window and Map/FlatMapElements could use a brief discussion on the mailing list as to what the proper API should be. KinesisIO is, I think, due for a rewrite. However, CountingInput and PubsubIO I think are great targets for people to pick up. Any volunteers? On Fri, Mar 3, 2017 at 11:30 AM Robert Bradshaw <rober...@google.com.invalid> wrote: Here's a crazy idea: what if we had a virtual fixit/hackathon to knock these out (similar to the virtual meet-up, but with an agenda)? I find communal hacking sessions towards a common goal are a good way to get to know each other and get a lot done. Would there be any interest in this? On Wed, Mar 1, 2017 at 11:30 AM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote:Hey all, First couple rounds of fixes are in. Thanks Aviem Zur for contributing TextIO fixes and Dan Halperin for reviewing! One more fix by Reuven in progress (https://github.com/apache/beam/pull/1927). Follow https://issues.apache.org/jira/browse/BEAM-1353 and sub-issuesforthe status. Many of the changes are backwards-incompatible (though with only minor changes in pipelines required). I'll make a separate announcement about that on the users list now. Call for help with the other sub-issues still stands! They are pretty simple work items but pretty important since this is a blocker for declaring stable BEAM API. On Wed, Feb 8, 2017 at 3:51 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote: Thanks Eugene. I will tackle some Jira when back next week. Regards JB On Feb 7, 2017, 18:16, at 18:16, Eugene Kirpichov <kirpic...@google.com.INVALID> wrote:Hey all, I bit the bullet and audited all PTransform classes in Beam Java SDK and filed JIRA issues for all violations I could find. I linked all them to the master JIRA issue https://issues.apache.org/jira/browse/BEAM-1353 In general, all of these should be fixed before declaring Beam stable API. Would appreciate if some senior folks looked at the issues and confirmed that my suggested changes make sense. PRs very welcome :) (though I'll be gone for the next few weeks so I can't review right now) Many of these are very easy to fix (a few lines of code); some require a little more code, but as far as I can tell all of them are mechanical. On Tue, Jan 31, 2017 at 4:10 PM Eugene Kirpichov <kirpic...@google.com> wrote:On Mon, Jan 30, 2017 at 7:56 PM Dan Halperin<dhalp...@google.com.invalid>wrote: On Mon, Jan 30, 2017 at 5:42 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote:Hello, The PTransform Style Guide is live https://beam.apache.org/contribute/ptransform-style-guide/ - anaturalnext step is to audit Beam libraries for compliance and file JIRAs forplacesthat need to be fixed. It'd be great to finish these cleanupsbeforedeclaring Beam stable API. Please take a look and file JIRAs / post suggestions on thisthread!I think it'll also make a great source of easy and useful work fornewcontributors. Some things I remember off the top of my head: - TextIO, KafkaIO use coders improperly - coders should not be usedas ageneral-purpose byte parsing mechanism.Can you say more about Kafka? Kafka actually exports byte[] bydefault,whereas Text files are String by default. So it does not seem nearlyasegregious for Kafka as it is for Text. Agreed that KafkaIO is less egregious, but it still has methods withKeyCoder and withValueCoder - these should be replaced withsomethingthat doesn't take Coder. - HadoopFileSource is not packaged as a PTransform- Some connectors, e.g. KafkaIO, should use AutoValue for theirparameterbuilders, but don'tIsn't AutoValue entirely an internal implementation detail that isnotexposed(*) to users? I think this is irrelevant to a stable API. Agreed - doesn't block stable API, but still a good thing to dobecause itmakes the code cleaner (for KafkaIO there's a long-standing PR thatwasblocked on ratifying the style guide https://github.com/apache/beam/pull/1048) (*) except that it makes transforms not able to be final, which is a regression. I think AutoValue use should generally be considered *very* optional.Intransforms I author, I prefer not to use AutoValue because it makesthecode more complex and less readable. Yeah, guidance on when to use / not use AutoValue could be improved.Ithink it makes a lot of sense when the transform has more than one ortwoparameters or when the set of parameters can grow.- A few connectors improperly use - Some transforms expose their transform type as "Something.Bound"and"Something.Unbound", e.g. TextIO.Read.Bound - such names are banned"banned" is a strong word to use here. All of these are just recommendations. In general yes; the goal of the style guide is to be the default,where ifyou deviate from it, you should have a good reason. I don't thinkthereever exists a good reason to name a transform Something.Bound/Unbound though.I filed an umbrella JIRAhttps://issues.apache.org/jira/browse/BEAM-1353about making existing Beam transforms comply with the guide - let'scrowdsourcethis! Thanks.
-- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com