Re: export to parquet

2020-08-26 Thread George Woodring
I don't know how many hoops you want to jump through, but we use AWS and
Athena to create them:

   - Export table as JSON
   - Put on AWS S3
   - Create JSON table in Athena
   - Use the JSON table to create a parquet table

The parquet files will be in S3 as well after the parquet table is
created.  If you are interested I can share the AWS CLI commands we use.
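
Roughly, the steps look like the Python sketch below (not our exact commands;
it assumes psycopg2 and boto3, and the bucket, database, prefixes, and column
list are placeholders):

    # Rough sketch of the four steps above (not the exact commands we run).
    # Assumes psycopg2 and boto3; bucket, database, prefixes, and the column
    # list are placeholders.
    import boto3
    import psycopg2

    BUCKET = "my-bucket"
    JSON_PREFIX = "export/json/"
    PARQUET_PREFIX = "export/parquet/"

    # 1. Export the table as newline-delimited JSON, one object per line, which
    #    is what Athena's JSON SerDe expects. (COPY's text format backslash-
    #    escapes some characters, so string-heavy data may need extra care.)
    conn = psycopg2.connect("dbname=mydb")
    with conn.cursor() as cur, open("mytable.json", "w") as f:
        cur.copy_expert("COPY (SELECT row_to_json(t) FROM mytable t) TO STDOUT", f)

    # 2. Put the file on S3.
    boto3.client("s3").upload_file("mytable.json", BUCKET, JSON_PREFIX + "mytable.json")

    # 3. Create the JSON table in Athena, then 4. use it to create the Parquet
    #    table via CTAS; Athena writes the Parquet files under PARQUET_PREFIX.
    athena = boto3.client("athena")

    def run(sql):
        # Athena runs queries asynchronously; in a real script, poll
        # get_query_execution() until each statement finishes before the next.
        return athena.start_query_execution(
            QueryString=sql,
            QueryExecutionContext={"Database": "mydb"},
            ResultConfiguration={"OutputLocation": f"s3://{BUCKET}/athena-results/"},
        )

    run(f"""
        CREATE EXTERNAL TABLE mytable_json (id bigint, payload string)
        ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
        LOCATION 's3://{BUCKET}/{JSON_PREFIX}'
    """)
    run(f"""
        CREATE TABLE mytable_parquet
        WITH (format = 'PARQUET', external_location = 's3://{BUCKET}/{PARQUET_PREFIX}')
        AS SELECT * FROM mytable_json
    """)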

George Woodring
iGLASS Networks
www.iglass.net


On Wed, Aug 26, 2020 at 3:00 PM Scott Ribe wrote:

> I have no Hadoop, no HDFS. Just looking for the easiest way to export some
> PG tables into Parquet format for testing--need to determine what kind of
> space reduction we can get before deciding whether to look into it more.
>
> Any suggestions on particular tools? (PG 12, Linux)
>
>
> --
> Scott Ribe
> scott_r...@elevated-dev.com
> https://www.linkedin.com/in/scottribe/
>


Re: export to parquet

2020-08-26 Thread Scott Ribe
> On Aug 26, 2020, at 1:11 PM, Chris Travers  wrote:
> 
> For simple exporting, the simplest thing is a single-node instance of Spark.

Thanks.

> You can read Parquet files in Postgres using
> https://github.com/adjust/parquet_fdw if you so desire, but it does not
> support writing, as Parquet files are basically immutable.

Yep, that's the next step. Well, really it is what I am interested in testing, 
but first I need my data in parquet format (and confirmation that it gets 
decently compressed).
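
For the size comparison, a quick sketch (assuming psycopg2 and a local copy of
the exported file; the table and file names are placeholders):

    # Compare the table's total on-disk size in Postgres (heap + indexes + TOAST)
    # with the size of the exported Parquet file.
    import os
    import psycopg2

    conn = psycopg2.connect("dbname=mydb")
    with conn.cursor() as cur:
        cur.execute("SELECT pg_total_relation_size('mytable')")
        pg_bytes = cur.fetchone()[0]

    parquet_bytes = os.path.getsize("mytable.parquet")
    print(f"postgres: {pg_bytes / 1e6:.1f} MB  parquet: {parquet_bytes / 1e6:.1f} MB  "
          f"ratio: {pg_bytes / parquet_bytes:.1f}x")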



Re: export to parquet

2020-08-26 Thread Chris Travers
On Wed, Aug 26, 2020 at 9:00 PM Scott Ribe wrote:

> I have no Hadoop, no HDFS. Just looking for the easiest way to export some
> PG tables into Parquet format for testing--need to determine what kind of
> space reduction we can get before deciding whether to look into it more.
>
> Any suggestions on particular tools? (PG 12, Linux)
>

For simple exporting, the simplest thing is a single-node instance of Spark.
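
Something along these lines, assuming pyspark and the Postgres JDBC driver jar
are available (connection details and paths are placeholders):

    # Single-node Spark export: read the table over JDBC, write it out as Parquet.
    # Assumes pyspark and the Postgres JDBC driver jar; URL, credentials, and
    # paths are placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("pg-to-parquet")
             .master("local[*]")                       # single node, all local cores
             .config("spark.jars", "/path/to/postgresql-jdbc.jar")
             .getOrCreate())

    df = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://localhost:5432/mydb")
          .option("dbtable", "mytable")
          .option("user", "postgres")
          .option("password", "secret")
          .load())

    # The output is a directory of .parquet part files; Snappy compression is
    # Spark's default for Parquet.
    df.write.parquet("/data/mytable_parquet")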

You can read Parquet files in Postgres using
https://github.com/adjust/parquet_fdw if you so desire, but it does not
support writing, as Parquet files are basically immutable.
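
Reading the files back is roughly this one-time setup (a sketch; the column
list and file path are placeholders, and the OPTIONS follow the parquet_fdw
README, so check them against the version you build):

    # One-time parquet_fdw setup, driven from Python to match the other sketches
    # (it is ordinary SQL; run it as a superuser). Columns must match the Parquet
    # schema, and the file path points at a single .parquet file on local disk.
    import psycopg2

    conn = psycopg2.connect("dbname=mydb")
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute("CREATE EXTENSION IF NOT EXISTS parquet_fdw")
        cur.execute("CREATE SERVER IF NOT EXISTS parquet_srv "
                    "FOREIGN DATA WRAPPER parquet_fdw")
        cur.execute("""
            CREATE FOREIGN TABLE IF NOT EXISTS mytable_parquet (id bigint, payload text)
            SERVER parquet_srv
            OPTIONS (filename '/data/mytable_parquet/part-00000.parquet')
        """)
        cur.execute("SELECT count(*) FROM mytable_parquet")
        print(cur.fetchone()[0])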


>
> --
> Scott Ribe
> scott_r...@elevated-dev.com
> https://www.linkedin.com/in/scottribe/
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


export to parquet

2020-08-26 Thread Scott Ribe
I have no Hadoop, no HDFS. Just looking for the easiest way to export some PG 
tables into Parquet format for testing--need to determine what kind of space 
reduction we can get before deciding whether to look into it more.

Any suggestions on particular tools? (PG 12, Linux)


--
Scott Ribe
scott_r...@elevated-dev.com
https://www.linkedin.com/in/scottribe/