Hi. I'm using the StreamingFileSink to write partitioned data to S3. The code is below:

StreamingFileSink<GenericRecord> sink = StreamingFileSink
    .forBulkFormat(
        new Path("s3a://test-bucket/test"),
        ParquetAvroFactory.getParquetWriter(schema, "GZIP"))
    .withBucketAssigner(new PartitionBucketAssigner(partitionColumns))
    .build();

How can I remove the partition columns from the data (or avoid populating them in the GenericRecord)? My problem is that the AWS Glue crawler creates duplicate columns in the table.

Thanks,
Yitzchak.