Hello Team,
I'm glad to be connecting with the people who help with 'pyarrow' usage.
My question: the snippet below creates data files in S3 with partition keys,
but the partition columns themselves are not written into the data files. Is
there a way to include the partition columns in the data files, so that a
customer querying a data file directly sees all columns plus the partition
columns?

Kindly help me with this.


targetKey = self.s3BucketName + '/' + outputDirectory + targetDirectory[:-1]

log.debug("Single or multiple partition columns being passed: %s -> %s",
          [partitionCols], targetKey)
pq.write_to_dataset(table=table,
                    root_path=targetKey,
                    row_group_size=self.chunkSizeLimit,
                    partition_cols=[partitionCols],
                    filesystem=s3,
                    compression='snappy',
                    partition_filename_cb=lambda x: '-'.join(str(v) for v in x) + '.parquet')

-- 
Regards,
Mahesha S
Cell:8015127140
