Hi Pablo,
Thanks for your attention.
I'm sorry, my mistake: I wrote "Cs extension" but I meant the .csv extension!
I meant an example like this one: load-csv-file-from-google-cloud-storage
<https://kontext.tech/docs/DataAndBusinessIntelligence/p/load-csv-file-from-google-cloud-storage-to-bigquery-using-dataflow>

I was thinking of using Apache POI to read each row from the sheet and pass
it to the next ParDo as CellRow elements,
something like this:
.apply("xlsxToMap", ParDo.of(new DoFn<CellRow, Map<String, String>>() {.....

I don't know whether this is the most elegant approach...
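
A minimal sketch of the row-to-map step that such a DoFn could perform, kept free of Beam and POI so that it runs on its own. RowMapper and toMap are hypothetical names, not part of either library. It assumes the cell values have already been extracted as strings (for example with POI's DataFormatter) and that the first row of the sheet supplies the column names:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Beam or Apache POI.
final class RowMapper {
    private RowMapper() {}

    // Pairs each header name with the cell value at the same position.
    // Cells missing at the end of a short row are mapped to an empty string.
    // Cells beyond the last header column are ignored.
    static Map<String, String> toMap(List<String> header, List<String> cells) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < header.size(); i++) {
            String value = i < cells.size() ? cells.get(i) : "";
            row.put(header.get(i), value);
        }
        return row;
    }
}
```

Inside the DoFn, the method annotated with @ProcessElement would call a helper like this for each row and emit the resulting map. Keeping the mapping in a plain static method also makes it easy to unit test without a pipeline.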

If you have any ideas, let me know. They would be welcome!


On Mon, Apr 15, 2019 at 6:01 PM Pablo Estrada <[email protected]> wrote:

> Hello Henrique,
>
> I am not aware of existing Beam transforms specifically used for reading
> in XLSX data. Can you share what you mean by "examples related with Cs
> extension"?
>
> I am aware of some Python libraries for this sort of thing[1]. You could
> use the FileIO transforms in the Python SDK to find each file, and then
> write a DoFn that is able to read in data from these files. Check out this
> unit test using FileIO to read CSV files[2].
>
> Let me know if that helps, or if I went in the wrong direction of what you
> needed.
> Best
> -P.
>
> [1] https://openpyxl.readthedocs.io/en/stable/
> [2]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/fileio_test.py#L128-L148
>
> On Mon, Apr 15, 2019 at 12:47 PM Henrique Molina <
> [email protected]> wrote:
>
>> Hello
>>
>> I would like to use best practices from Apache Beam to read XLSX files.
>> However, I found examples only related with Cs extension.
>> Does anyone have a sample that uses ParDo to collect all columns and sheets
>> from an Excel xlsx file?
>> Afterwards I will load the data into Google BigQuery.
>>
>> Thanks & Regards
>>
>>
>
