Re: [DISCUSS] Parquet blog

2026-01-28 Thread Aihua Xu
Hi community,

I just drafted a variant blog.
When you have time, can you help review?

Thanks,
Aihua

On Fri, Jan 9, 2026 at 4:33 AM Andrew Lamb  wrote:

> The idea of blog posts came up the other day too, so I filed some tickets
> to hopefully track and organize the effort and help collaboration
>
> I think there are at least three proposed blogs in this thread, so I filed
> three tickets:
>
> - Variant[1]
> - Geospatial[2]
> - Complex Types[3]
>
> Looking forward to seeing these get posted. Let me know how I can help
> (e.g. figuring out how to actually post them on the site, for example).
>
> Andrew
>
> [1]: https://github.com/apache/parquet-site/issues/147
> [2]: https://github.com/apache/parquet-site/issues/148
> [3]: https://github.com/apache/parquet-site/issues/149
>
> On Thu, Oct 2, 2025 at 8:33 PM Aihua Xu  wrote:
>
> > Agree. We started to write a little bit for variant and I can take
> variant
> > blog.
> > >
> > > On Oct 2, 2025, at 5:07 PM, Julien Le Dem  wrote:
> > >
> > > I was thinking that we should talk more about the new features in
> > Parquet.
> > > Specifically, write blog posts about Variant or Geo types.
> > > How they came to be, what they enable, how they are integrated in the
> > > ecosystem.
> > > The Parquet blog is only the release announcements for now (
> > > https://parquet.apache.org/blog/) and we can make it an opportunity to
> > > spotlight some of the recent additions to the project.
> > > Contributors welcome.
> > > Some sources of inspiration:
> > > https://arrow.apache.org/blog/
> > > https://openlineage.io/blog
> > > https://datafusion.apache.org/blog/
> > > What do you think?
> >
>


Re: [DISCUSS] Parquet blog

2026-01-09 Thread Andrew Lamb
The idea of blog posts came up the other day too, so I filed some tickets
to hopefully track and organize the effort and help collaboration.

I think there are at least three proposed blogs in this thread, so I filed
three tickets:

- Variant[1]
- Geospatial[2]
- Complex Types[3]

Looking forward to seeing these get posted. Let me know how I can help
(e.g. figuring out how to actually post them on the site).

Andrew

[1]: https://github.com/apache/parquet-site/issues/147
[2]: https://github.com/apache/parquet-site/issues/148
[3]: https://github.com/apache/parquet-site/issues/149

On Thu, Oct 2, 2025 at 8:33 PM Aihua Xu  wrote:

> Agree. We started to write a little bit for variant and I can take variant
> blog.
> >
> > On Oct 2, 2025, at 5:07 PM, Julien Le Dem  wrote:
> >
> > I was thinking that we should talk more about the new features in
> Parquet.
> > Specifically, write blog posts about Variant or Geo types.
> > How they came to be, what they enable, how they are integrated in the
> > ecosystem.
> > The Parquet blog is only the release announcements for now (
> > https://parquet.apache.org/blog/) and we can make it an opportunity to
> > spotlight some of the recent additions to the project.
> > Contributors welcome.
> > Some sources of inspiration:
> > https://arrow.apache.org/blog/
> > https://openlineage.io/blog
> > https://datafusion.apache.org/blog/
> > What do you think?
>


Re: [DISCUSS] Parquet blog

2025-10-18 Thread Arnav Balyan
Just noticed Julien mentioned having one doc per topic. To clarify, this
one is for Nested (Complex) Data Types specifically. I’ve renamed the doc
to focus only on Nested (Complex) Data types so it’s scoped properly.
Link (same access):
https://docs.google.com/document/d/1cx61Fp82HHB8TvePO4bFIbX1EXvv9Sy8Wx_YnyKeeW8/edit?tab=t.0

Please feel free to comment or suggest edits. Thanks!

- Arnav

On Wed, Oct 8, 2025 at 5:45 AM Arnav Balyan  wrote:

> Thanks Jia and Dewey!
>
> I wasn’t sure if we planned to have separate docs or a shared one, I just
> put up this one to get things started. Please feel free to add your content
> or make any changes as you see fit. Thanks!
>
> – Arnav
>
> On Wed, Oct 8, 2025 at 5:38 AM Arnav Balyan 
> wrote:
>
>> Just sharing this link so we can collaborate together!
>>
>> https://docs.google.com/document/d/1cx61Fp82HHB8TvePO4bFIbX1EXvv9Sy8Wx_YnyKeeW8/edit?tab=t.0
>>
>> It's a crude empty doc, please feel free to edit. It should have edit
>> access for all thanks!
>>
>> Arnav
>>
>> On Wed, Oct 8, 2025 at 5:31 AM Jia Yu  wrote:
>>
>>> Thanks, Julien.
>>>
>>> I will start a google doc about the Parquet Geo type blog! I will also
>>> ping Dewey for some help.
>>>
>>> Jia
>>>
>>> On Tue, Oct 7, 2025 at 4:57 PM Julien Le Dem  wrote:
>>> >
>>> > All we need is someone to start a draft on a google doc and share it
>>> on the
>>> > list to start collaborating :)
>>> > (for each article)
>>> > Thank you everyone who's chiming in.
>>> >
>>> > On Sun, Oct 5, 2025 at 3:20 AM Arnav Balyan 
>>> wrote:
>>> >
>>> > >
>>> > > Hi Julien and all,
>>> > >
>>> > > +1 - I’d be happy to contribute a short write-up on how Parquet
>>> handles
>>> > > complex nested data types.
>>> > >
>>> > > - Arnav
>>> > >
>>> > >
>>> > > On 2025/10/02 23:43:52 Julien Le Dem wrote:
>>> > > > I was thinking that we should talk more about the new features in
>>> > > Parquet.
>>> > > > Specifically, write blog posts about Variant or Geo types.
>>> > > > How they came to be, what they enable, how they are integrated in
>>> the
>>> > > > ecosystem.
>>> > > > The Parquet blog is only the release announcements for now (
>>> > > > https://parquet.apache.org/blog/) and we can make it an
>>> opportunity to
>>> > > > spotlight some of the recent additions to the project.
>>> > > > Contributors welcome.
>>> > > > Some sources of inspiration:
>>> > > > https://arrow.apache.org/blog/
>>> > > > https://openlineage.io/blog
>>> > > > https://datafusion.apache.org/blog/
>>> > > > What do you think?
>>> > >
>>>
>>


Re: [DISCUSS] Parquet blog

2025-10-18 Thread Julien Le Dem
All we need is someone to start a draft on a google doc and share it on the
list to start collaborating :)
(for each article)
Thank you everyone who's chiming in.

On Sun, Oct 5, 2025 at 3:20 AM Arnav Balyan  wrote:

>
> Hi Julien and all,
>
> +1 - I’d be happy to contribute a short write-up on how Parquet handles
> complex nested data types.
>
> - Arnav
>
>
> On 2025/10/02 23:43:52 Julien Le Dem wrote:
> > I was thinking that we should talk more about the new features in
> Parquet.
> > Specifically, write blog posts about Variant or Geo types.
> > How they came to be, what they enable, how they are integrated in the
> > ecosystem.
> > The Parquet blog is only the release announcements for now (
> > https://parquet.apache.org/blog/) and we can make it an opportunity to
> > spotlight some of the recent additions to the project.
> > Contributors welcome.
> > Some sources of inspiration:
> > https://arrow.apache.org/blog/
> > https://openlineage.io/blog
> > https://datafusion.apache.org/blog/
> > What do you think?
>


Re: [DISCUSS] Parquet blog

2025-10-18 Thread Arnav Balyan
Just sharing this link so we can collaborate together!
https://docs.google.com/document/d/1cx61Fp82HHB8TvePO4bFIbX1EXvv9Sy8Wx_YnyKeeW8/edit?tab=t.0

It's a crude empty doc; please feel free to edit. It should have edit
access for all. Thanks!

Arnav

On Wed, Oct 8, 2025 at 5:31 AM Jia Yu  wrote:

> Thanks, Julien.
>
> I will start a google doc about the Parquet Geo type blog! I will also
> ping Dewey for some help.
>
> Jia
>
> On Tue, Oct 7, 2025 at 4:57 PM Julien Le Dem  wrote:
> >
> > All we need is someone to start a draft on a google doc and share it on
> the
> > list to start collaborating :)
> > (for each article)
> > Thank you everyone who's chiming in.
> >
> > On Sun, Oct 5, 2025 at 3:20 AM Arnav Balyan 
> wrote:
> >
> > >
> > > Hi Julien and all,
> > >
> > > +1 - I’d be happy to contribute a short write-up on how Parquet handles
> > > complex nested data types.
> > >
> > > - Arnav
> > >
> > >
> > > On 2025/10/02 23:43:52 Julien Le Dem wrote:
> > > > I was thinking that we should talk more about the new features in
> > > Parquet.
> > > > Specifically, write blog posts about Variant or Geo types.
> > > > How they came to be, what they enable, how they are integrated in the
> > > > ecosystem.
> > > > The Parquet blog is only the release announcements for now (
> > > > https://parquet.apache.org/blog/) and we can make it an opportunity
> to
> > > > spotlight some of the recent additions to the project.
> > > > Contributors welcome.
> > > > Some sources of inspiration:
> > > > https://arrow.apache.org/blog/
> > > > https://openlineage.io/blog
> > > > https://datafusion.apache.org/blog/
> > > > What do you think?
> > >
>


Re: [DISCUSS] Parquet blog

2025-10-18 Thread Jia Yu
Thanks, Julien.

I will start a google doc about the Parquet Geo type blog! I will also
ping Dewey for some help.

Jia

On Tue, Oct 7, 2025 at 4:57 PM Julien Le Dem  wrote:
>
> All we need is someone to start a draft on a google doc and share it on the
> list to start collaborating :)
> (for each article)
> Thank you everyone who's chiming in.
>
> On Sun, Oct 5, 2025 at 3:20 AM Arnav Balyan  wrote:
>
> >
> > Hi Julien and all,
> >
> > +1 - I’d be happy to contribute a short write-up on how Parquet handles
> > complex nested data types.
> >
> > - Arnav
> >
> >
> > On 2025/10/02 23:43:52 Julien Le Dem wrote:
> > > I was thinking that we should talk more about the new features in
> > Parquet.
> > > Specifically, write blog posts about Variant or Geo types.
> > > How they came to be, what they enable, how they are integrated in the
> > > ecosystem.
> > > The Parquet blog is only the release announcements for now (
> > > https://parquet.apache.org/blog/) and we can make it an opportunity to
> > > spotlight some of the recent additions to the project.
> > > Contributors welcome.
> > > Some sources of inspiration:
> > > https://arrow.apache.org/blog/
> > > https://openlineage.io/blog
> > > https://datafusion.apache.org/blog/
> > > What do you think?
> >


RE: [DISCUSS] Parquet blog

2025-10-17 Thread Arnav Balyan


Hi Julien and all,

+1 - I’d be happy to contribute a short write-up on how Parquet handles complex 
nested data types.

- Arnav


On 2025/10/02 23:43:52 Julien Le Dem wrote:
> I was thinking that we should talk more about the new features in Parquet.
> Specifically, write blog posts about Variant or Geo types.
> How they came to be, what they enable, how they are integrated in the
> ecosystem.
> The Parquet blog is only the release announcements for now (
> https://parquet.apache.org/blog/) and we can make it an opportunity to
> spotlight some of the recent additions to the project.
> Contributors welcome.
> Some sources of inspiration:
> https://arrow.apache.org/blog/
> https://openlineage.io/blog
> https://datafusion.apache.org/blog/
> What do you think?


Re: [DISCUSS] Parquet blog

2025-10-07 Thread Arnav Balyan
Thanks Jia and Dewey!

I wasn’t sure if we planned to have separate docs or a shared one, I just
put up this one to get things started. Please feel free to add your content
or make any changes as you see fit. Thanks!

– Arnav

On Wed, Oct 8, 2025 at 5:38 AM Arnav Balyan  wrote:

> Just sharing this link so we can collaborate together!
>
> https://docs.google.com/document/d/1cx61Fp82HHB8TvePO4bFIbX1EXvv9Sy8Wx_YnyKeeW8/edit?tab=t.0
>
> It's a crude empty doc, please feel free to edit. It should have edit
> access for all thanks!
>
> Arnav
>
> On Wed, Oct 8, 2025 at 5:31 AM Jia Yu  wrote:
>
>> Thanks, Julien.
>>
>> I will start a google doc about the Parquet Geo type blog! I will also
>> ping Dewey for some help.
>>
>> Jia
>>
>> On Tue, Oct 7, 2025 at 4:57 PM Julien Le Dem  wrote:
>> >
>> > All we need is someone to start a draft on a google doc and share it on
>> the
>> > list to start collaborating :)
>> > (for each article)
>> > Thank you everyone who's chiming in.
>> >
>> > On Sun, Oct 5, 2025 at 3:20 AM Arnav Balyan 
>> wrote:
>> >
>> > >
>> > > Hi Julien and all,
>> > >
>> > > +1 - I’d be happy to contribute a short write-up on how Parquet
>> handles
>> > > complex nested data types.
>> > >
>> > > - Arnav
>> > >
>> > >
>> > > On 2025/10/02 23:43:52 Julien Le Dem wrote:
>> > > > I was thinking that we should talk more about the new features in
>> > > Parquet.
>> > > > Specifically, write blog posts about Variant or Geo types.
>> > > > How they came to be, what they enable, how they are integrated in
>> the
>> > > > ecosystem.
>> > > > The Parquet blog is only the release announcements for now (
>> > > > https://parquet.apache.org/blog/) and we can make it an
>> opportunity to
>> > > > spotlight some of the recent additions to the project.
>> > > > Contributors welcome.
>> > > > Some sources of inspiration:
>> > > > https://arrow.apache.org/blog/
>> > > > https://openlineage.io/blog
>> > > > https://datafusion.apache.org/blog/
>> > > > What do you think?
>> > >
>>
>


Re: [DISCUSS] Parquet blog

2025-10-03 Thread Julien Le Dem
Sounds great!

On Fri, Oct 3, 2025 at 6:21 AM Andrew Lamb  wrote:

> I think writing up some of the recent progress in Parquet is a great idea
> and I think will help the project to continue to grow and evolve.  I am
> happy to collaborate on any of the topics listed so far.
>
> Another post that might be interesting is something like this:
>
> "The future of Parquet: Calling for Ideas" that makes a broad, public call
> for people to come join the conversation and help us understand how they
> would like to see the format evolve.  We could also highlight the recent
> documentation of the proposal structure / process.
>
> Andrew
>
>
>
> On Thu, Oct 2, 2025 at 10:30 PM Julien Le Dem  wrote:
>
> > I think the best way would be to coordinate on the list in case more than
> > one person wants to contribute on the same topic.
> > If that happens, you could either collaborate on a blog post with
> multiple
> > authors or have two posts with different angles.
> > We can use google docs as working documents shared on the list where
> people
> > can contribute suggestions or comments.
> > We can even use github issues for blog posts we'd like to see.
> >
> >
> > On Thu, Oct 2, 2025 at 6:13 PM Jia Yu  wrote:
> >
> > > Great idea!
> > >
> > > Happy to write a blog post about the geo data type!
> > >
> > > Thanks,
> > > Jia Yu
> > >
> > > On Thu, Oct 2, 2025 at 17:45 Aihua Xu  wrote:
> > >
> > > > Agree. We started to write a little bit for variant and I can take
> > > variant
> > > > blog.
> > > > >
> > > > > On Oct 2, 2025, at 5:07 PM, Julien Le Dem 
> wrote:
> > > > >
> > > > > I was thinking that we should talk more about the new features in
> > > > Parquet.
> > > > > Specifically, write blog posts about Variant or Geo types.
> > > > > How they came to be, what they enable, how they are integrated in
> the
> > > > > ecosystem.
> > > > > The Parquet blog is only the release announcements for now (
> > > > > https://parquet.apache.org/blog/) and we can make it an
> opportunity
> > to
> > > > > spotlight some of the recent additions to the project.
> > > > > Contributors welcome.
> > > > > Some sources of inspiration:
> > > > > https://arrow.apache.org/blog/
> > > > > https://openlineage.io/blog
> > > > > https://datafusion.apache.org/blog/
> > > > > What do you think?
> > > >
> > >
> >
>


Re: [DISCUSS] Parquet blog

2025-10-03 Thread Andrew Lamb
I think writing up some of the recent progress in Parquet is a great idea,
and I think it will help the project continue to grow and evolve.  I am
happy to collaborate on any of the topics listed so far.

Another post that might be interesting is something like this:

"The future of Parquet: Calling for Ideas" that makes a broad, public call
for people to come join the conversation and help us understand how they
would like to see the format evolve.  We could also highlight the recent
documentation of the proposal structure / process.

Andrew



On Thu, Oct 2, 2025 at 10:30 PM Julien Le Dem  wrote:

> I think the best way would be to coordinate on the list in case more than
> one person wants to contribute on the same topic.
> If that happens, you could either collaborate on a blog post with multiple
> authors or have two posts with different angles.
> We can use google docs as working documents shared on the list where people
> can contribute suggestions or comments.
> We can even use github issues for blog posts we'd like to see.
>
>
> On Thu, Oct 2, 2025 at 6:13 PM Jia Yu  wrote:
>
> > Great idea!
> >
> > Happy to write a blog post about the geo data type!
> >
> > Thanks,
> > Jia Yu
> >
> > On Thu, Oct 2, 2025 at 17:45 Aihua Xu  wrote:
> >
> > > Agree. We started to write a little bit for variant and I can take
> > variant
> > > blog.
> > > >
> > > > On Oct 2, 2025, at 5:07 PM, Julien Le Dem  wrote:
> > > >
> > > > I was thinking that we should talk more about the new features in
> > > Parquet.
> > > > Specifically, write blog posts about Variant or Geo types.
> > > > How they came to be, what they enable, how they are integrated in the
> > > > ecosystem.
> > > > The Parquet blog is only the release announcements for now (
> > > > https://parquet.apache.org/blog/) and we can make it an opportunity
> to
> > > > spotlight some of the recent additions to the project.
> > > > Contributors welcome.
> > > > Some sources of inspiration:
> > > > https://arrow.apache.org/blog/
> > > > https://openlineage.io/blog
> > > > https://datafusion.apache.org/blog/
> > > > What do you think?
> > >
> >
>


Re: [DISCUSS] Parquet blog

2025-10-02 Thread Julien Le Dem
I think the best way would be to coordinate on the list in case more than
one person wants to contribute on the same topic.
If that happens, you could either collaborate on a blog post with multiple
authors or have two posts with different angles.
We can use google docs as working documents shared on the list where people
can contribute suggestions or comments.
We can even use github issues for blog posts we'd like to see.


On Thu, Oct 2, 2025 at 6:13 PM Jia Yu  wrote:

> Great idea!
>
> Happy to write a blog post about the geo data type!
>
> Thanks,
> Jia Yu
>
> On Thu, Oct 2, 2025 at 17:45 Aihua Xu  wrote:
>
> > Agree. We started to write a little bit for variant and I can take
> variant
> > blog.
> > >
> > > On Oct 2, 2025, at 5:07 PM, Julien Le Dem  wrote:
> > >
> > > I was thinking that we should talk more about the new features in
> > Parquet.
> > > Specifically, write blog posts about Variant or Geo types.
> > > How they came to be, what they enable, how they are integrated in the
> > > ecosystem.
> > > The Parquet blog is only the release announcements for now (
> > > https://parquet.apache.org/blog/) and we can make it an opportunity to
> > > spotlight some of the recent additions to the project.
> > > Contributors welcome.
> > > Some sources of inspiration:
> > > https://arrow.apache.org/blog/
> > > https://openlineage.io/blog
> > > https://datafusion.apache.org/blog/
> > > What do you think?
> >
>


Re: [DISCUSS] Parquet blog

2025-10-02 Thread Jia Yu
Great idea!

Happy to write a blog post about the geo data type!

Thanks,
Jia Yu

On Thu, Oct 2, 2025 at 17:45 Aihua Xu  wrote:

> Agree. We started to write a little bit for variant and I can take variant
> blog.
> >
> > On Oct 2, 2025, at 5:07 PM, Julien Le Dem  wrote:
> >
> > I was thinking that we should talk more about the new features in
> Parquet.
> > Specifically, write blog posts about Variant or Geo types.
> > How they came to be, what they enable, how they are integrated in the
> > ecosystem.
> > The Parquet blog is only the release announcements for now (
> > https://parquet.apache.org/blog/) and we can make it an opportunity to
> > spotlight some of the recent additions to the project.
> > Contributors welcome.
> > Some sources of inspiration:
> > https://arrow.apache.org/blog/
> > https://openlineage.io/blog
> > https://datafusion.apache.org/blog/
> > What do you think?
>


Re: [DISCUSS] Parquet blog

2025-10-02 Thread Aihua Xu
Agreed. We started to write a little bit for variant, and I can take the variant blog.
> 
> On Oct 2, 2025, at 5:07 PM, Julien Le Dem  wrote:
> 
> I was thinking that we should talk more about the new features in Parquet.
> Specifically, write blog posts about Variant or Geo types.
> How they came to be, what they enable, how they are integrated in the
> ecosystem.
> The Parquet blog is only the release announcements for now (
> https://parquet.apache.org/blog/) and we can make it an opportunity to
> spotlight some of the recent additions to the project.
> Contributors welcome.
> Some sources of inspiration:
> https://arrow.apache.org/blog/
> https://openlineage.io/blog
> https://datafusion.apache.org/blog/
> What do you think?