alamb opened a new issue, #140:
URL: https://github.com/apache/parquet-site/issues/140

   We recently added Variant to parquet format -- see 
https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#encoding-types
   
   However, the only documentation that currently exists is the low-level 
technical spec. The higher-level Parquet documentation does not contain 
anything about Variant: https://parquet.apache.org/docs/
   
   This is inconvenient when I discuss adding Variant support to various 
systems: there is no high-level overview or link to point people at.
   
   I would like to have a high-level summary page on parquet.apache.org that:
   1. Explains the use case for Variant (semi-structured data)
   2. Gives a technical overview of the encoding (with diagrams)
   3. Explains how shredding works and gives some examples (with diagrams)
   
   
   Some existing material to use:
   1. [slides from Accelerating Apache Parquet with metadata stores and 
specialized indexes using Apache DataFusion](https://docs.google.com/presentation/d/1e_Z_F8nt2rcvlNvhU11khF5lzJJVqNtqtyJ-G3mp4-Q),
  [recording on YouTube](https://www.youtube.com/watch?v=74YsJT1-Rdk)
   2. Original Databricks announcement: 
https://www.databricks.com/blog/introducing-open-variant-data-type-delta-lake-and-apache-spark
   3. Databricks announcement: 
https://www.databricks.com/blog/introducing-variant-new-open-standard-semi-structured-data-apache-parquettm-delta-lake
 (this focuses quite a bit on system-level integration of Variant in Databricks 
and reads more like a product feature announcement)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

