Re: parquet-format status

2024-03-07 Thread Vinoo Ganesh
Hi Gabor - Thanks for providing that context! Given the variety of different implementations, it sounds like keeping the status quo is best for now. I'll publish the latest version of the website and will also look into including versioned links to the tagged parquet-format github repo there too.

Re: parquet-format status

2024-03-07 Thread Gábor Szádovszky
There is a big difference between the repos of Arrow, Avro, Iceberg etc. and Parquet. The mentioned projects have everything in one repo including the different language bindings etc. so it is natural to have the specs there as well and having universal releases. Meanwhile Parquet has different

Re: parquet-format status

2024-03-07 Thread Vinoo Ganesh
Hi Antoine - Perhaps my thoughts weren't clear - but I'm mostly pointing out a few things: 1. The parquet-format repo doesn't have much code other than the thrift definition 2. Parquet operates fairly uniquely compared to other products in this space, which maintain doc versions either in the

Re: parquet-format status

2024-03-07 Thread Uwe L. Korn
I can strongly second Antoine's response here. It is a small but very important repository that holds crucial information for the project. Best Uwe On Thu, Mar 7, 2024, at 1:17 PM, Antoine Pitrou wrote: > Hello, > > I am surprised that this is suggesting to deprecate or delete a > repository just

Re: parquet-format status

2024-03-07 Thread Antoine Pitrou
Hello, I am surprised that this is suggesting to deprecate or delete a repository just because a website building procedure isn't properly set up to deal with it. ISTM the "right" solution would be for the Parquet website to automatically update its contents based on the latest released version

Re: parquet-format status

2024-03-06 Thread Jan Finis
You do want version control and a place to discuss spec changes for all spec documents so they need to be in *some* repo. The website is nice to have, but it should just be derived from documents stored in a repo. Whether that repo is parquet-format or parquet-mr isn't too significant. Having said

Re: parquet-format status

2024-03-05 Thread Vinoo Ganesh
Hi Gang, Thanks - the historical context definitely makes sense and I hear your concern about breaking existing links. One thing I observed, though, is that this choice also makes Parquet a bit unique in this space. For example, Iceberg's Table spec (https://iceberg.apache.org/spec/) and

Re: parquet-format status

2024-03-05 Thread Gang Wu
Hi Vinoo, IMO, we cannot do this because the parquet-format repo serves as the dedicated place to hold the parquet specs, which includes the thrift definition file and a set of documents tagged for all versions. Some projects also directly reference the link of the markdown files, which will be

parquet-format status

2024-03-05 Thread Vinoo Ganesh
Hi Parquet Dev - There have been some conversations about content stored on the parquet-format github repo vs. the website. Doing a cursory pass of the parquet-format repo, it looks like, other than the markdown documentation stored in the repo, most of