You will need to provide more information. Does the data contain records? Are the records homogeneous, i.e., do they all have the same fields? What is the format of the data? Are records separated by line breaks or by some other separator? Is the data sharded across multiple files? How big is each shard?
On 2/8/22, 11:50 AM, "Danilo Sousa" <danilosousa...@gmail.com> wrote:

Hi,

I have to transform unstructured text into a DataFrame. Could anyone please help with Scala code?

The DataFrame needs these columns:

operadora filial unidade contrato empresa plano codigo_beneficiario nome_beneficiario

Sample input:

Relação de Beneficiários Ativos e Excluídos
Carteira em#27/12/2019##Todos os Beneficiários
Operadora#AMIL
Filial#SÃO PAULO#Unidade#Guarulhos

Contrato#123456 - Test
Empresa#Test
Plano#Código Beneficiário#Nome Beneficiário
58693 - NACIONAL R COPART PJCE#073930312#Joao Silva
58693 - NACIONAL R COPART PJCE#073930313#Maria Silva

Contrato#898011000 - FUNDACAO GERDAU
Empresa#FUNDACAO GERDAU
Plano#Código Beneficiário#Nome Beneficiário
58693 - NACIONAL R COPART PJCE#065751353#Jose Silva
58693 - NACIONAL R COPART PJCE#065751388#Joana Silva
58693 - NACIONAL R COPART PJCE#065751353#Felipe Silva
58693 - NACIONAL R COPART PJCE#065751388#Julia Silva
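
Pending answers to the questions at the top of this thread, here is a minimal sketch of one way the sample above could be parsed. It rests on several assumptions that the thread does not confirm: each record sits on its own line, fields are separated by `#`, the header lines (Operadora, Filial/Unidade, Contrato, Empresa) apply to every beneficiary line that follows until they are overwritten, and the whole file fits in memory on one machine. The names `Beneficiario`, `BeneficiarioParser`, and `parse` are hypothetical, chosen only for illustration.

```scala
// One output row per beneficiary; field names follow the columns requested above.
final case class Beneficiario(
  operadora: String, filial: String, unidade: String,
  contrato: String, empresa: String, plano: String,
  codigo_beneficiario: String, nome_beneficiario: String)

object BeneficiarioParser {
  // Header values seen so far; assumed to apply to all following data lines.
  private final case class Ctx(
    operadora: String = "", filial: String = "", unidade: String = "",
    contrato: String = "", empresa: String = "")

  def parse(lines: Iterable[String]): List[Beneficiario] = {
    val (_, rows) = lines.foldLeft((Ctx(), List.empty[Beneficiario])) {
      case ((ctx, acc), line) =>
        // limit -1 keeps empty trailing fields instead of dropping them
        line.trim.split("#", -1).toList match {
          case "Operadora" :: v :: _ =>
            (ctx.copy(operadora = v.trim), acc)
          case "Filial" :: f :: "Unidade" :: u :: _ =>
            (ctx.copy(filial = f.trim, unidade = u.trim), acc)
          case "Contrato" :: v :: _ =>
            // Assumption: keep the full value, e.g. "123456 - Test"
            (ctx.copy(contrato = v.trim), acc)
          case "Empresa" :: v :: _ =>
            (ctx.copy(empresa = v.trim), acc)
          case "Plano" :: _ =>
            (ctx, acc) // column-header line, not data
          case plano :: codigo :: nome :: Nil =>
            // Exactly three fields: treated as a beneficiary line
            val row = Beneficiario(
              ctx.operadora, ctx.filial, ctx.unidade, ctx.contrato,
              ctx.empresa, plano.trim, codigo.trim, nome.trim)
            (ctx, row :: acc)
          case _ =>
            (ctx, acc) // title, date line, blank lines, anything else
        }
    }
    rows.reverse // rows were prepended, so restore input order
  }
}
```

With a SparkSession named `spark` in scope, `import spark.implicits._` followed by `BeneficiarioParser.parse(lines).toDF()` should give a DataFrame with the eight columns, where `lines` could come from `scala.io.Source.fromFile(path, "UTF-8").getLines().toList`. This runs entirely on the driver. Because each data line depends on the header lines before it, a large or sharded input would need a different approach, which is why the size and layout questions matter.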