FYI, the deprecations of AvroIO ReadAll/ParseAll and TextIO ReadAll
were merged into master today and will be part of 2.13.0.

For those working on the other SDKs (Python, Go): please avoid
implementing such transforms (or deprecate them too if they already
exist) so the API concepts stay coherent across SDKs.
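
For anyone migrating Java pipelines, here is a minimal sketch of what the
replacement could look like for TextIO, i.e. FileIO matching followed by
`TextIO.readFiles()`. The class name, the step names and the input path are
only illustrative assumptions, and default settings (empty-match treatment,
compression) may need adjusting to mirror what a given pipeline configured
on `readAll()`.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

// Hypothetical class name, used only for this illustration.
public class ReadAllMigration {

  // Roughly what patterns.apply(TextIO.readAll()) did, spelled out with FileIO.
  static PCollection<String> readLines(PCollection<String> filePatterns) {
    return filePatterns
        // Expand each file pattern into metadata for the matching files.
        .apply("MatchPatterns", FileIO.matchAll())
        // Turn the matched metadata into readable file handles.
        .apply("OpenFiles", FileIO.readMatches())
        // Read every file line by line.
        .apply("ReadLines", TextIO.readFiles());
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    // Hypothetical input location; replace with real file patterns.
    PCollection<String> patterns = p.apply(Create.of("/tmp/input/*.txt"));
    readLines(patterns);
    p.run().waitUntilFinish();
  }
}
```

The same shape should apply to AvroIO by swapping the last step for the
corresponding `AvroIO.readFiles(...)` transform.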

On Wed, Feb 6, 2019 at 11:27 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> +1
>
> Thanks for that Ismaël.
>
> Regards
> JB
>
> On 06/02/2019 11:24, Ismaël Mejía wrote:
> > Since it seems we have consensus on deprecating both transforms, I created:
> >
> > BEAM-6605 Deprecate TextIO.readAll() and TextIO.ReadAll transform
> > BEAM-6606 Deprecate AvroIO.readAll() and AvroIO.ReadAll transform
> >
> > Thanks everyone.
> >
> > On Fri, Feb 1, 2019 at 7:03 PM Chamikara Jayalath <chamik...@google.com> wrote:
> >>
> >> The Python SDK doesn't have FileIO yet, so let's keep the ReadAllFromFoo
> >> transforms currently available for the various file types until we have it.
> >>
> >> Thanks,
> >> Cham
> >>
> >> On Fri, Feb 1, 2019 at 7:41 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> >>>
> >>> Hi,
> >>>
> >>> readFiles() should be used IMHO. We should remove readAll() to avoid
> >>> confusion.
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On 30/01/2019 17:25, Ismaël Mejía wrote:
> >>>> Hello,
> >>>>
> >>>> A ‘recent’ usage pattern in Beam is for file-based IOs to provide a
> >>>> `readAll()` implementation that takes a `PCollection` of file
> >>>> patterns, matches them, and reads the resulting files, e.g. `TextIO`,
> >>>> `AvroIO`. `ReadAll` is implemented by an expand function that matches
> >>>> files with FileIO and then reads them using a format-specific
> >>>> `ReadFiles` transform, e.g. TextIO.ReadFiles, AvroIO.ReadFiles. So in
> >>>> the end `ReadAll` in the Java implementation is just a user-friendly
> >>>> API that hides FileIO.match + ReadFiles.
> >>>>
> >>>> Most recent IOs do NOT implement ReadAll, to encourage the more
> >>>> composable approach of FileIO + ReadFiles, e.g. XmlIO and ParquetIO.
> >>>>
> >>>> Implementing ReadAll as a wrapper is relatively easy and is definitely
> >>>> user friendly, but it has drawbacks: it may be error-prone and it adds
> >>>> more code to maintain (mostly ‘repeated’ code). However, `readAll` is a
> >>>> more abstract pattern that applies beyond file-based IOs: it also
> >>>> makes sense in other transforms that map a `PCollection` of read
> >>>> requests, and it is the basis for SDF composable-style APIs like
> >>>> the recent `HBaseIO.readAll()`.
> >>>>
> >>>> So the question is, should we:
> >>>>
> >>>> [1] Implement `readAll` in all file-based IOs to be user friendly and
> >>>> accept the (minor) maintenance cost
> >>>>
> >>>> or
> >>>>
> >>>> [2] Deprecate `readAll` in file-based IOs and encourage users to use
> >>>> FileIO + `readFiles` (less maintenance, and it encourages composition).
> >>>>
> >>>> I checked the Python code base quickly but could not tell whether
> >>>> the FileIO match + ReadFiles pattern applies there; it would be nice
> >>>> to hear what the Python folks think about this too.
> >>>>
> >>>> This discussion comes from a recent Slack conversation with Łukasz
> >>>> Gajowy, and we wanted to settle on one approach to make the IO
> >>>> signatures consistent, so any opinions/preferences?
> >>>>
> >>>
> >>> --
> >>> Jean-Baptiste Onofré
> >>> jbono...@apache.org
> >>> http://blog.nanthrax.net
> >>> Talend - http://www.talend.com
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
