Re: Welcoming Peter Vary as a new committer!

2021-01-25 Thread Jungtaek Lim
Congratulations Peter! Well deserved! On Tue, Jan 26, 2021 at 3:40 AM Wing Yew Poon wrote: > Congratulations Peter! > > > On Mon, Jan 25, 2021 at 10:35 AM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> Congratulations! >> >> On Jan 25, 2021, at 12:34 PM, Jacques Nadeau >> wrote: >>

Re: S3 strong read-after-write consistency

2020-12-02 Thread Jungtaek Lim
What about the S3FileIO implementation? I see some issues filed reporting that even the Hive catalog brings unexpected issues when working with S3, and S3FileIO is supposed to fix that (according to Ryan). Is it now safe to use the Hive catalog + Hadoop API for S3 without S3FileIO? On Wed, Dec 2, 2020 at 6:54 PM,

Re: [ANNOUNCE] Apache Iceberg release 0.10.0

2020-11-16 Thread Jungtaek Lim
Thanks everyone for the huge efforts on achieving the release! On Tue, Nov 17, 2020 at 9:09 AM Anton Okolnychyi wrote: > I am pleased to announce the release of Apache Iceberg 0.10.0! > > Apache Iceberg is an open table format for huge analytic datasets. Iceberg > delivers high query

Re: [VOTE] Release Apache Iceberg 0.10.0 RC2

2020-11-02 Thread Jungtaek Lim
This is probably not the right thread to ask, but I encountered the issue again while verifying the RC, so I'm asking here: I'm consistently encountering multiple test failures due to HMS. It shouldn't matter, as others have verified the UTs, but if someone is aware of the issue and the resolution (or at least where to

Re: Welcoming Jingsong Lee as a new committer

2020-10-10 Thread Jungtaek Lim
Congrats! On Sat, Oct 10, 2020 at 3:56 PM, Junjie Chen wrote: > Congratulations! Thanks for your great contribution in Flink sink and > source! > > On Sat, Oct 10, 2020 at 9:10 AM 张军 wrote: > >> >> Congratulations >> >> JunZhang >> zhangjunem...@126.com >> >>

Re: Welcoming Zheng Hu as a new committer

2020-10-10 Thread Jungtaek Lim
Congrats! On Sat, Oct 10, 2020 at 3:56 PM, Junjie Chen wrote: > Congratulations! Thanks for your great contribution in Flink sink and > source! > > On Sat, Oct 10, 2020 at 9:09 AM 张军 wrote: > >> >> Congratulations >> >> JunZhang >> zhangjunem...@126.com >> >>

Re: Impact on Spark-Iceberg usage on missing to enforce clustering/sort requirement (SPARK-23889)

2020-09-21 Thread Jungtaek Lim
> On Wed, Sep 16, 2020 at 4:27 PM Jungtaek Lim > wrote: > >> Hi all, >> >> Recently I played around with a partitioned Iceberg table in Spark, and >> realized it requires a manual sort. I had to google to find a workaround - I >> guess there's no

Impact on Spark-Iceberg usage on missing to enforce clustering/sort requirement (SPARK-23889)

2020-09-16 Thread Jungtaek Lim
this correctly? I feel we may need to put effort into pushing SPARK-23889 forward for Iceberg (or consider moving down to the DSv1 writer), as I think the workaround is unacceptable for many end users. We probably also need to document the impact and the workaround until we fix the issue. Thanks, Jungtaek Lim (HeartSaVioR

Re: Question about Iceberg release cadence

2020-08-26 Thread Jungtaek Lim
y would like to see the case also covered by Iceberg. > I see there's a lot of work in progress on the milestone (and these are > great features which should be done), but after this we cover both batch > and streaming workloads being done with Spark, which is a huge step forward > on

Re: [DISCUSS] Rename iceberg-hive module?

2020-08-19 Thread Jungtaek Lim
+1 for `iceberg-hive-metastore` and also +1 for RD's proposal. Thanks, Jungtaek Lim (HeartSaVioR) On Thu, Aug 20, 2020 at 11:20 AM Jingsong Li wrote: > +1 for `iceberg-hive-metastore` > > I'm confused about `iceberg-hive` and `iceberg-mr`. > > Best, > Jingsong > > O

Re: [VOTE] Release Apache Iceberg 0.9.1 RC0

2020-08-19 Thread Jungtaek Lim
Just FYI, looks like the 0.9.1 artifacts are available now, but the release page on the website hasn't been updated yet. On Sat, Aug 15, 2020 at 9:46 AM Ryan Blue wrote: > With 8 +1 votes and no others, this RC passes. Thanks for validating the > patch release, everyone! > > I'll get started on

Re: [DISCUSS] 0.9.1 release

2020-08-01 Thread Jungtaek Lim
n doing the remainder of the refactor in > master? > > On Fri, Jul 31, 2020 at 5:29 PM Jungtaek Lim > wrote: > >> If we still have some more days I think #1280 >> <https://github.com/apache/iceberg/pull/1280>: "fix serialization issue >> in BaseCombinedScanTask with Kryo

Re: [DISCUSS] 0.9.1 release

2020-07-31 Thread Jungtaek Lim
If we still have a few more days, I think #1280: "fix serialization issue in BaseCombinedScanTask with Kryo" is a good candidate to include. The bug affects both Spark and Flink (according to #1279). On

Re: Effect of enabling 'write.metadata.delete-after-commit.enabled'

2020-07-29 Thread Jungtaek Lim
t if you want to look up data by >> a dimension other than time -- for example, using the bucket of an ID -- >> then the natural clustering doesn't work well. In that case, you can use >> RewriteManifests or the RewriteManifestsAction to cluster data files by >> some key. That really

Re: Effect of enabling 'write.metadata.delete-after-commit.enabled'

2020-07-28 Thread Jungtaek Lim
1 PM Jungtaek Lim wrote: > I'd love to contribute documentation about the actions - just need some > time to understand the needs for some actions (like RewriteManifestAction). > > I just submitted a PR for structured streaming sink [1]. I mentioned > expireSnapshot() there with linking jav

Re: Effect of enabling 'write.metadata.delete-after-commit.enabled'

2020-07-27 Thread Jungtaek Lim
just need someone to write up docs and > contribute them. We don't use the streaming sink, so I've unfortunately > overlooked it. > > On Mon, Jul 27, 2020 at 3:25 PM Jungtaek Lim > wrote: > >> Thanks for the quick response! >> >> And yes I also went through

Re: Effect of enabling 'write.metadata.delete-after-commit.enabled'

2020-07-27 Thread Jungtaek Lim
o prune > old table versions -- although expiring snapshots will remove them from > table metadata and limit how far back you can time travel. > > On Mon, Jul 27, 2020 at 4:33 AM Jungtaek Lim > wrote: > >> Hi devs, >> >> I'm experimenting with Apache Iceberg for Stru

Effect of enabling 'write.metadata.delete-after-commit.enabled'

2020-07-27 Thread Jungtaek Lim
ffect time-travel (as it refers to a snapshot), and restoring is also from snapshot, so not sure which point to consider when turning on the option. Thanks, Jungtaek Lim