+1 for adding the variant spec to Parquet. I'm looking forward to working
on the addition of shredding.
As for the details, I think I also prefer a separate repository,
`parquet-variant`, but I don't think we necessarily need to determine that
question up front.
On Tue, Sep 10, 2024 at 9:05 AM Ga
To me, what matters the most is not really the repository, but the release
process. Since the variant code is going to be fairly rapidly developed and
may not have a stable API, I'd prefer to have it on a separate release
cycle and start the versioning at 0.1.0 to avoid a misconception that the
API
On Wed, Sep 11, 2024 at 8:53 AM Gang Wu <
> > ustcwg-re5jqeeqqe8avxtiumw...@public.gmane.org> wrote:
> > > > > > >
> > > > > > > > Let's just vote for the adoption in this thread and discuss
> > the
Hopefully I can help because I wrote those rules.
I think that the correct type is List<List<int32>>. Because none of the
first 3 rules apply, the element type is the repeated type, which is a
repeated int32.
The rules are primarily trying to account for cases where known structures
were used. If the repeate
+1
People that need support can still use older versions, so I don't think
that this would be a significant problem for anyone.
On Wed, Feb 5, 2025 at 8:20 AM Fokko Driesprong wrote:
> Hi everyone,
>
> I would like to discuss the deprecation/removal of parquet-pig.
>
> The last Pig release
+1 (binding)
Thanks to everyone that worked on getting this update done! It's been an
amazing amount of discussion and I'm excited to see it ready to go.
On Thu, Feb 6, 2025 at 8:11 AM Jia Yu wrote:
> +1 (non-binding)
>
> I’m really looking forward to this! It’s going to be a fantastic addition
I think that Parquet should exactly reproduce the data that is written to
files, rather than either allowing or requiring Parquet implementations to
normalize types. To me, that's a fundamental guarantee of the storage
layer. The compute layer can decide to normalize types and take actions to
make
> > implementations to shred data by taking the schema of a field the first
> > > time it appears as a reasonable heuristic? More generally it might be
> > good
> > > to start discussing what API changes we expect are needed to support
> > > shredding in re
Iceberg has conflicting meetings. There is an Iceberg community sync at 9
AM PT every 3 weeks, with the next one on 19 March. There is also an
Iceberg REST catalog sync every 3 weeks at 9 AM PT one week after each
general community sync, so the next is on 26 March.
On Wed, Mar 5, 2025 at 1:02 PM J
The Variant in the Thrift definition is a struct, so we can easily add a
version later. The only reason to add it now is if we want to be able to
break forward compatibility with shredding. I'd be fine adding an
encoding/shredding version = 1.
On Thu, Feb 20, 2025 at 4:49 PM Micah Kornfield
wrote: