Hi,

readFiles() should be used IMHO. We should remove readAll() to avoid
confusion.

Regards
JB

On 30/01/2019 17:25, Ismaël Mejía wrote:
> Hello,
> 
> A ‘recent’ pattern of use in Beam is to have in file based IOs a
> `readAll()` implementation that basically matches a `PCollection` of
> file patterns and reads them, e.g. `TextIO`, `AvroIO`. `ReadAll` is
> implemented by a expand function that matches files with FileIO and
> then reads them using a format specific `ReadFiles` transform e.g.
> TextIO.ReadFiles, AvroIO.ReadFiles. So in the end `ReadAll` in the
> Java implementation is just an user friendly API to hide FileIO.match
> + ReadFiles.
> 
> Most recent IOs do NOT implement ReadAll to encourage the more
> composable approach of File + ReadFiles, e.g. XmlIO and ParquetIO.
> 
> Implementing ReadAll as a wrapper is relatively easy and is definitely
> user friendly, but it has an  issue, it may be error-prone and it adds
> more code to maintain (mostly ‘repeated’ code). However `readAll` is a
> more abstract pattern that applies not only to File based IOs so it
> makes sense for example in other transforms that map a `Pcollection`
> of read requests and is the basis for SDF composable style APIs like
> the recent `HBaseIO.readAll()`.
> 
> So the question is should we:
> 
> [1] Implement `readAll` in all file based IOs to be user friendly and
> assume the (minor) maintenance cost
> 
> or
> 
> [2] Deprecate `readAll` from file based IOs and encourage users to use
> FileIO + `readFiles` (less maintenance and encourage composition).
> 
> I just checked quickly in the python code base but I did not find if
> the File match + ReadFiles pattern applies, but it would be nice to
> see what the python guys think on this too.
> 
> This discussion comes from a recent slack conversation with Łukasz
> Gajowy, and we wanted to settle into one approach to make the IO
> signatures consistent, so any opinions/preferences?
> 

-- 
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Reply via email to