Hi Ashutosh,

> An extension just for COPY to/from parquet looks limited in
> functionality. Shouldn't this be viewed as an FDW or Table AM support
> for parquet or other formats? Of course the later is much larger in
> scope compared to the first one. But there may already be efforts
> underway
> https://www.postgresql.org/about/news/parquet-s3-fdw-01-was-newly-released-2179/

Many thanks for sharing your thoughts on this! We are using parquet_fdw [2], but it is a read-only FDW. What users typically need is to dump their data as fast as possible in a given format and then either upload it to the cloud as historical data or transfer it to another system (Spark, etc). The data can be accessed later if needed, read-only. Note that when accessing the historical data with parquet_fdw you have essentially zero ingestion time.

Another possible use case is transferring data to PostgreSQL from another source. Here the requirements are similar: the data should be dumped as fast as possible from the source, transferred over the network, and imported as fast as possible.

In other words, personally I'm unaware of use cases where somebody needs a complete read/write FDW or TableAM implementation for formats like Parquet, ORC, etc. Also, to my knowledge, these formats are not particularly optimized for this.

[2]: https://github.com/adjust/parquet_fdw

--
Best regards,
Aleksander Alekseev