Hello Henrique,

I am not aware of existing Beam transforms specifically used for reading in
XLSX data. Can you share what you mean by "examples related with Cs
extension"?

I am aware of some Python libraries for this sort of thing[1]. You could
use the FileIO transforms in the Python SDK to find each file, and then
write a DoFn that is able to read in data from these files. Check out this
unit test using FileIO to read CSV files[2].

Let me know if that helps, or if I went in the wrong direction from what
you needed.
Best,
-P.

[1] https://openpyxl.readthedocs.io/en/stable/
[2]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/fileio_test.py#L128-L148

On Mon, Apr 15, 2019 at 12:47 PM Henrique Molina <[email protected]>
wrote:

> Hello
>
> I would like to use best practices from Apache Beams to read Xlsx. however
> I found examples only related with Cs extension.
> someone there is sample using ParDo to Collect all columns and sheets from
> Excel xlsx ?
> Afterwards I will put into google Big query.
>
> Thanks & Regards
>
>
