Re: Welcome Péter, Amogh and Eduard to the Apache Iceberg PMC

2024-08-13 Thread huaxin gao
Congratulations, everyone! On Tue, Aug 13, 2024 at 1:53 PM Ryan Blue wrote: > Congratulations! Thanks for all your contributions! > > On Tue, Aug 13, 2024 at 1:48 PM Steve Zhang > wrote: > >> Congrats everyone, well deserved! >> >> Thanks, >> Steve Zhang >> >> >> >> On Aug 13, 2024, at 1:31 PM,

Re: [DISCUSS] Implementing a table-level statistics file to store column statistics

2024-08-06 Thread huaxin gao
ata present >>> for data files be possible? >>> To me, it seems like doing some amount of derivation at query time is >>> okay, as long as the time it takes to do the derivation doesn't increase >>> significantly as the table gets larger. >>>

Re: [DISCUSS] Implementing a table-level statistics file to store column statistics

2024-08-02 Thread huaxin gao
in, max, and null > counts. > > Best, > Piotr > > > > On Fri, 2 Aug 2024 at 20:47, Samrose Ahmed wrote: > >> Isn't this addressed by the partition statistics feature, or do you want >> to have one row for the entire table? >> >> On Fri, Aug 2,

[DISCUSS] Implementing a table-level statistics file to store column statistics

2024-08-02 Thread huaxin gao
I would like to initiate a discussion on implementing a table-level statistics file to store column statistics, specifically min, max, and null counts. The original discussion can be found in this Slack thread: https://apache-iceberg.slack.com/archives/C03LG1D563F/p1676395480005779. In Spark 3.4,

Re: Dropping JDK 8 support

2024-07-23 Thread huaxin gao
t in Iceberg 2.0 release". > It's fine for people to propose dropping JDK8 support sooner than that > (and I'm not against that), but the proposal being voted on should not be > switched mid-vote. > - Wing Yew > > > On Tue, Jul 23, 2024 at 10:45 PM huaxin gao > wr

Re: Dropping JDK 8 support

2024-07-23 Thread huaxin gao
in 1.6+ versions, which can be another > thread. > > On Wed, Jul 24, 2024 at 10:45 AM huaxin gao > wrote: > >> Hi Manu, >> Thanks for the discussion. Is your concern about customers who use JDK 8 >> with Spark 3.5? But we will face the same problem if we dr

Re: Dropping JDK 8 support

2024-07-23 Thread huaxin gao
Hi Manu, Thanks for the discussion. Is your concern about customers who use JDK 8 with Spark 3.5? But we will face the same problem if we drop JDK 8 in Iceberg 2.0, unless we plan to drop Spark 3.5 support in 2.0. Huaxin On Tue, Jul 23, 2024 at 7:30 PM Renjie Liu wrote: > Hi, Manu: > > > If we

Re: Dropping JDK 8 support

2024-07-23 Thread huaxin gao
harder because we're trying to get more things in a release. > Putting out a major release just for breaking API changes makes the most > sense to me. > > On Tue, Jul 23, 2024 at 9:50 AM Russell Spitzer > wrote: > >> +1 >> >> On Tue, Jul 23, 2024 at 11:4

Re: Dropping JDK 8 support

2024-07-23 Thread huaxin gao
ix-with-github-actions > > -Jack > > On Tue, Jul 23, 2024 at 9:15 AM huaxin gao wrote: > >> It seems my earlier question might have been overlooked. Could we clarify >> if JDK 8 support is being dropped in the next version? The proposal >> indicated for Iceberg 2

Re: Dropping JDK 8 support

2024-07-23 Thread huaxin gao
>>>> >>>>> On Tue, Jul 23, 2024 at 9:40 AM Szehon Ho >>>>> wrote: >>>>> >>>>>> +1 for dropping JDK 8 in Iceberg 2.0. I also wonder the same thing >>>>>> as Huaxin (sorry if I missed a previous thread on Iceb

Re: Dropping JDK 8 support

2024-07-22 Thread huaxin gao
+1 (non-binding). I have a question about Iceberg versioning. After the 1.6 release, will there be versions 1.7, 1.8 and 1.9, or will it go straight to 2.0? On Mon, Jul 22, 2024 at 5:32 PM Manu Zhang wrote: > If JDK 8 support is dropped in 2.0, will we continue to fix critical > issues in 1.6+?

Re: Building with JDK 21

2024-07-19 Thread huaxin gao
+1 in favor of adding Java 21 support. +1 in favor of removing Java 8 support. I am currently working on Spark 4.0 / Iceberg integration. Spark 4.0 runs on Java 17/21. On Fri, Jul 19, 2024 at 4:58 AM Piotr Findeisen wrote: > Hi, > > We recently start

Re: [VOTE] Fix property names in REST spec for statistics / partition statistics

2024-07-09 Thread huaxin gao
+1 On Tue, Jul 9, 2024 at 10:50 PM Driesprong, Fokko wrote: > +1 (binding) > > Op wo 10 jul 2024 om 07:47 schreef Renjie Liu > >> +1 (non binding) >> >> On Wed, Jul 10, 2024 at 1:45 PM Daniel Weeks wrote: >> >>> +1 (binding) >>> >>> On Tue, Jul 9, 2024, 8:35 PM Eduard Tudenhöfner < >>> etudenh

Re: Making the NDV property required for theta sketch blobs in Puffin

2024-06-21 Thread huaxin gao
+1 for making the ndv blob metadata property required for theta sketches. On Fri, Jun 21, 2024 at 2:54 PM Amogh Jahagirdar <2am...@gmail.com> wrote: > Hey all, > > I wanted to raise this thread to discuss a spec change proposal > for making the ndv b

Re: Dynamically Support Spark Native Engine in Iceberg

2024-02-18 Thread huaxin gao
this. >> >> Cell : 425-233-8271 >> >> >> On Tue, Feb 13, 2024 at 4:38 PM huaxin gao >> wrote: >> >>> Hello Iceberg community, >>> >>> As you may already know, Project Comet >>> <https://github.com/apache/arrow-datafusion-

Dynamically Support Spark Native Engine in Iceberg

2024-02-13 Thread huaxin gao
Hello Iceberg community, As you may already know, Project Comet, a plugin to accelerate Spark query execution by leveraging DataFusion and Arrow, has been open sourced under the Apache Arrow umbrella. To capitalize on the capabilities of Project

Re: In Remembrance of Kyle

2022-12-05 Thread huaxin gao
I am extremely shocked and saddened to hear of Kyle's passing. When I made my very first Iceberg PR last August, Kyle reviewed it immediately and helped me on it, and he did so for almost all my PRs. I pulled out a couple of my old PRs just now and re-read his comments. He liked to put smiling fa