a new contributor

2018-10-19 Thread Heejong Lee
Hi, I just wanted to introduce myself as a new contributor. I'm a new member of the Apache Beam team at Google and will be working on IO modules. Happy to meet you all! Thanks, Heejong

[PROPOSAL] ParquetIO support for Python SDK

2018-10-24 Thread Heejong Lee
Hi, I'm working on BEAM-: Parquet IO for Python SDK. Issue: https://issues.apache.org/jira/browse/BEAM- Design doc: https://docs.google.com/document/d/1-FT6zmjYhYFWXL8aDM5mNeiUnZdKnnB021zTo4S-0Wg WIP PR: https://github.com/apache/beam/pull/6763 Any feedback is appreciated. Thanks!

Re: [PROPOSAL] ParquetIO support for Python SDK

2018-10-30 Thread Heejong Lee
8 at 4:45 PM Ahmet Altay wrote: > >> Thank you Heejong. Could you also share a summary of the design document >> (major points/decisions) in the mailing list? >> >> On Wed, Oct 24, 2018 at 4:08 PM, Heejong Lee wrote: >> >>> Hi, >>> >>

Re: [PROPOSAL] ParquetIO support for Python SDK

2018-11-13 Thread Heejong Lee
o base this on > byte sizes; will this be in v1 or will there be other parameter(s) > that we'll have to support going forward? > On Tue, Oct 30, 2018 at 10:42 PM Heejong Lee wrote: > > > > Thanks all for the valuable feedback on the document. Here's the summary > of planned fea

[PROPOSAL] decrease the number of threads for BigQuery streaming insertAll

2019-01-16 Thread Heejong Lee
Hi, I want to suggest the change[1] of the thread pool type in BigQuery streaming insert for Java SDK (BEAM-6443). When we insert small data into BigQuery very fast by using BigQueryIO.write, it generates lots of rate limit exceeded errors in a log file. It's mainly because the number of threads
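The proposal above targets the Java SDK, and its details are cut off in this snippet. As a rough, language-neutral illustration of the general idea (a bounded thread pool caps how many insert requests are in flight at once), here is a minimal Python sketch. It is not Beam code: `run_inserts`, `fake_insert`, and the cap of 4 workers are hypothetical names and values chosen for illustration only.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4  # hypothetical cap, not a value taken from the proposal


def run_inserts(batches, max_workers=MAX_WORKERS):
    """Send every batch through a bounded pool and report peak concurrency."""
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fake_insert(batch):
        # Stand-in for a streaming insert request; only tracks concurrency.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # simulate network latency
        with lock:
            state["active"] -= 1
        return len(batch)

    # The pool never runs more than max_workers requests at the same time,
    # however many batches are queued.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        inserted = sum(pool.map(fake_insert, batches))
    return inserted, state["peak"]
```

With an unbounded pool the peak would grow with the number of queued batches; here it stays at or below the configured cap, which is the property a rate-limited backend benefits from.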

Re: How to use "PortableRunner" in Python SDK?

2019-01-22 Thread Heejong Lee
You can also try without --streaming option. There's a separate streaming wordcount example in the same directory. If you want to look into the output files, it would be easier to use external target like gs:// instead of local file. python -m apache_beam.examples.wordcount --input=/etc/profile
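The advice above can be sketched as a small helper that assembles the wordcount command line without `--streaming` and with a remote output location. This is only a sketch under assumptions: the bucket path `gs://my-bucket/counts` and the job endpoint `localhost:8099` are placeholder values, and `build_wordcount_command` is a hypothetical helper, not part of Beam.

```python
import shlex


def build_wordcount_command(input_path, output_path, job_endpoint, streaming=False):
    """Return the argv list for the batch wordcount example on PortableRunner."""
    args = [
        "python", "-m", "apache_beam.examples.wordcount",
        "--input=" + input_path,
        "--output=" + output_path,
        "--runner=PortableRunner",
        "--job_endpoint=" + job_endpoint,
    ]
    if streaming:
        # Left off by default, following the suggestion in the reply.
        args.append("--streaming")
    return args


# Placeholder values; substitute a real bucket and job server address.
cmd = build_wordcount_command("/etc/profile", "gs://my-bucket/counts", "localhost:8099")
print(shlex.join(cmd))
```

Writing to a `gs://` path avoids having to look for output files inside the worker environment, which is the reason given in the reply for preferring an external target.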

Re: Add code quality checks to pre-commits.

2019-01-03 Thread Heejong Lee
>>> >>> We have never really used Sonarqube. It was turned on as a possibility >>> in the early days but never worked on past that point. Could be nice. I >>> suspect there's a lot to be gained by just finding very low numbers and >>> improving the

Re: Add code quality checks to pre-commits.

2019-01-03 Thread Heejong Lee
I don't have any experience of using SonarQube but Coverity worked well for me. Looks like it already has beam repo: https://scan.coverity.com/projects/11881 On Thu, Jan 3, 2019 at 1:27 PM Reuven Lax wrote: > checkstyle and findbugs are already run as precommit checks, are they not? > > On Thu,

Re: [DISCUSS] change the encoding scheme of Python StrUtf8Coder

2019-04-04 Thread Heejong Lee
>> On Wed, Apr 3, 2019 at 6:21 PM Robert Burke >> wrote: >> >>> >> >>> String UTF8 was recently added as a "standard coder " URN in the >> protos, but I don't think that developed beyond Java, so adding it to >> Python would be reasonable

Re: [DISCUSS] change the encoding scheme of Python StrUtf8Coder

2019-04-04 Thread Heejong Lee
beam/coders/coders.py#L321 > >>>>>>> > >>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> We should define the spec clearly and have cross-la

Re: [ANNOUNCE] New PMC Member: Pablo Estrada

2019-05-15 Thread Heejong Lee
Congratulations! On Wed, May 15, 2019 at 12:24 PM Niklas Hansson < niklas.sven.hans...@gmail.com> wrote: > Congratulations Pablo :) > > On Wed, May 15, 2019 at 21:21, Ruoyun Huang wrote: > >> Congratulations, Pablo! >> >> *From: *Charles Chen >> *Date: *Wed, May 15, 2019 at 11:04 AM >> *To: *dev

Re: [ANNOUNCE] New committer announcement: Udi Meiri

2019-05-03 Thread Heejong Lee
Congratulations! On Fri, May 3, 2019 at 3:53 PM Reza Rokni wrote: > Congratulations ! > > *From: *Reuven Lax > *Date: *Sat, 4 May 2019, 06:42 > *To: *dev > > Thank you! >> >> On Fri, May 3, 2019 at 3:15 PM Ankur Goenka wrote: >> >>> Congratulations Udi! >>> >>> On Fri, May 3, 2019 at 3:00 PM

Re: Artifact staging in cross-language pipelines

2019-04-23 Thread Heejong Lee
On Tue, Apr 23, 2019 at 2:07 AM, Robert Bradshaw wrote: > I've been out, so coming a bit late to the discussion, but here's my > thoughts. > > The expansion service absolutely needs to be able to provide the > dependencies for the transform(s) it expands. It seems the default, > foolproof way of doing

Re: [ANNOUNCE] New committer: Robert Burke

2019-07-16 Thread Heejong Lee
Congratulations! On Tue, Jul 16, 2019 at 1:34 PM Chamikara Jayalath wrote: > Congrats!! > > On Tue, Jul 16, 2019 at 1:31 PM Robin Qiu wrote: > >> Congrats, Robert!! >> >> On Tue, Jul 16, 2019 at 1:22 PM Alan Myrvold wrote: >> >>> Congrats, Robert! >>> >>> On Tue, Jul 16, 2019 at 11:46 AM

Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-26 Thread Heejong Lee
Congratulations! :) On Mon, Aug 26, 2019 at 2:44 PM Rui Wang wrote: > Congratulations! > > > -Rui > > On Mon, Aug 26, 2019 at 2:36 PM Hannah Jiang > wrote: > >> Congratulations Valentyn, well deserved! >> >> On Mon, Aug 26, 2019 at 2:34 PM Chamikara Jayalath >> wrote: >> >>> Congrats

Re: [ANNOUNCE] New committer: Kyle Weaver

2019-08-07 Thread Heejong Lee
Congratulations! On Wed, Aug 7, 2019 at 11:05 AM Tanay Tummalapalli wrote: > Congratulations! > > On Wed, Aug 7, 2019 at 11:27 PM Robin Qiu wrote: > >> Congratulations, Kyle! >> >> On Wed, Aug 7, 2019 at 5:04 AM Valentyn Tymofieiev >> wrote: >> >>> Congrats, Kyle! >>> >>> On Wed, Aug 7, 2019

Re: How to expose/use the External transform on Java SDK

2019-07-24 Thread Heejong Lee
I think it depends how we define "the core" part of the SDK. If we define the core as only the (abstract) data types which describe BEAM pipeline model then it would be more sensible to put external transform into a separate extension module (option 4). Otherwise, option 1 makes sense. On Wed,

Re: published containers overwrite locally built containers

2019-11-01 Thread Heejong Lee
tag is assumed for each released version. On Fri, Nov 1, 2019 at 10:56 AM Chamikara Jayalath wrote: > I think it makes sense to override published docker images with locally > built versions when testing HEAD. > > Thanks, > Cham > > On Thu, Oct 31, 2019 at 6:31 PM Heejon

Revamping the cross-language validate runner test suite

2019-11-08 Thread Heejong Lee
Hi, I'm working on revamping the cross-language validate runner test suite. Our current test suite for the cross-language transform is incomplete as it only has tests for Wordcount, DoFn, basic Count and basic Filter transforms. My plan is, in addition to our existing set of tests, to add all

published containers overwrite locally built containers

2019-10-31 Thread Heejong Lee
Hi, happy halloween! I'm looking into failing cross language post commit tests: https://issues.apache.org/jira/browse/BEAM-8534 After a few runs, I've found that published SDK harness containers overwrite locally built containers when

Re: published containers overwrite locally built containers

2019-11-06 Thread Heejong Lee
> >> On Fri, Nov 1, 2019 at 10:56 AM Chamikara Jayalath >> wrote: >> >>> I think it makes sense to override published docker images with locally >>> built versions when testing HEAD. >>> >>> Thanks, >>> Cham >>> >>>

Re: Artifact staging in cross-language pipelines

2019-12-12 Thread Heejong Lee
at 3:54 AM Maximilian Michels wrote: > Hey Heejong, > > I don't think so. It would be great to push this forward. > > Thanks, > Max > > On 26.11.19 02:49, Heejong Lee wrote: > > Hi, > > > > Is anyone actively working on artifact staging extension for

External transform API in Java SDK

2019-12-19 Thread Heejong Lee
I wanted to know if anybody has any comment on external transform API for Java SDK. `External.of()` can create external transform for Java SDK. Depending on input and output types, two additional methods are provided: `withMultiOutputs()` which specifies the type of PCollection and

Re: Error logging from fn_api_runners

2020-03-02 Thread Heejong Lee
I think it should be either info or debug but not error. On Mon, Mar 2, 2020 at 2:35 PM Ning Kang wrote: > Hi, > > I just observed some error level loggings like these: > ``` > ERROR:apache_beam.runners.portability.fn_api_runner:created 1 workers > {'worker_5': > at 0x127fdaa58>} >

Re: [DISCUSS][PROPOSAL] Improvements to the Apache Beam website

2020-01-27 Thread Heejong Lee
On Mon, Jan 27, 2020 at 11:19 AM Aizhamal Nurmamat kyzy wrote: > Hi Alexey, > > Answers are inline: > > Do we have any user demands for documentation translation into other >> languages? I’m asking this because, in my experience, it’s quite tough work >> to translate everything and it won’t be

Enabling a new Jenkins job

2020-02-05 Thread Heejong Lee
I created a new Jenkins job in my PR[1] and the new job shows "This project is currently disabled"[2]. Does anybody know how to enable the new job? [1]: https://github.com/apache/beam/pull/10758 [2]: https://builds.apache.org/job/beam_PostCommit_XVR_Spark/

Re: Enabling a new Jenkins job

2020-02-05 Thread Heejong Lee
Fixed. Seed job was overridden by another scheduled seed job. Thanks, Udi! On Wed, Feb 5, 2020 at 2:04 PM Heejong Lee wrote: > I created a new Jenkins job in my PR[1] and the new job shows "This > project is currently disabled"[2]. Does anybody know how to enable the > new

Re: [ANNOUNCE] New committer: Hannah Jiang

2020-01-28 Thread Heejong Lee
Congratulations! :) On Tue, Jan 28, 2020 at 4:43 PM Yichi Zhang wrote: > Congrats Hannah! > > On Tue, Jan 28, 2020 at 3:57 PM Yifan Zou wrote: > >> Congratulations Hannah!! >> >> On Tue, Jan 28, 2020 at 3:55 PM Boyuan Zhang wrote: >> >>> Thanks for all your contributions! Congratulations~ >>>

Re: Cross-language pipelines status

2020-02-11 Thread Heejong Lee
On Tue, Feb 11, 2020 at 9:37 AM Alexey Romanenko wrote: > Hi all, > > I just wanted to ask for more details about the status of cross-language > pipelines (rather, transforms). I see some discussions about that here, but > I think it’s more around cross-language IOs. > > I’ll appreciate for any

Re: External transform API in Java SDK

2020-01-02 Thread Heejong Lee
: https://issues.apache.org/jira/browse/BEAM-9048 On Mon, Dec 30, 2019 at 10:27 AM Luke Cwik wrote: > > > On Mon, Dec 23, 2019 at 12:20 PM Heejong Lee wrote: > >> >> >> On Fri, Dec 20, 2019 at 11:38 AM Luke Cwik wrote: >> >>> What do side inputs lo

Re: External transform API in Java SDK

2019-12-23 Thread Heejong Lee
DoFn.ProcessContext c) { out.output(x + c.sideInput(sideView)); } }) .withSideInputs(sideView)); > On Thu, Dec 19, 2019 at 4:39 PM Heejong Lee wrote: > >> I wanted to know if anybody has an

Re: No space left on device - beam-jenkins 1 and 7

2020-03-11 Thread Heejong Lee
Still seeing no space left on device errors on jenkins-7 (for example: https://builds.apache.org/job/beam_PreCommit_PythonLint_Commit/2754/) On Fri, Mar 6, 2020 at 7:11 PM Alan Myrvold wrote: > Did a one time cleanup of tmp files owned by jenkins older than 3 days. > Agree that we need a

Re: Re-running GitHub Actions jobs

2020-09-03 Thread Heejong Lee
On Thu, Sep 3, 2020 at 11:05 AM Brian Hulette wrote: > The new GitHub Actions workflows that run Java and Python tests against > different targets (macos, ubuntu, windows) are great! But just like our > Jenkins infra they flake occasionally. Should we be re-running all of these > jobs until we

Re: [ANNOUNCE] New committer: Reza Ardeshir Rokni

2020-09-10 Thread Heejong Lee
Congratulations! On Thu, Sep 10, 2020 at 4:42 PM Robert Bradshaw wrote: > Thank you and welcome, Reza! > > On Thu, Sep 10, 2020 at 4:00 PM Ahmet Altay wrote: > >> Congratulations Reza! And thank you for your contributions! >> >> On Thu, Sep 10, 2020 at 3:59 PM Chamikara Jayalath >> wrote: >>

Re: Jira components for cross-language transforms

2020-05-28 Thread Heejong Lee
If we use one meta component tag for all xlang related issues, I would prefer just "xlang". Then we could attach the "xlang" tag to not only language specific sdk tags but also other runner tags e.g. ['xlang', 'io-java-kafka'], ['xlang', 'runner-dataflow']. On Thu, May 28, 2020 at 7:49 PM Robert

Re: XLang sub-graph representation within the SDKs pipeline types

2020-07-02 Thread Heejong Lee
On Wed, Jul 1, 2020 at 7:18 PM Robert Burke wrote: > From the Go SDK side, it was built that way nearly from the start. > Historically there was a direct SDK rep -> Dataflow rep conversion, but > that's been replaced with a SDK rep -> Beam Proto -> Dataflow rep > conversion. > > In particular,

Re: Beam Jenkins Migration

2020-06-18 Thread Heejong Lee
This is awesome. Could non-committers also trigger the test now? On Wed, Jun 17, 2020 at 6:12 AM Damian Gadomski wrote: > Hello, > > Good news, we've just migrated to the new CI: https://ci-beam.apache.org. > As from now beam projects at builds.apache.org are disabled. > > If you experience any

Re: Python SDK ReadFromKafka: Timeout expired while fetching topic metadata

2020-06-08 Thread Heejong Lee
> >> Seems like Java dependency is not being properly set up when running the >> cross-language Kafka step. I don't think this was available for Beam 2.21. >> Can you try with the latest Beam HEAD or Beam 2.22 when it's released ? >> +Heejong Lee >> >

Re: [VOTE] Release 2.23.0, release candidate #1

2020-07-15 Thread Heejong Lee
wse/BEAM-10397 before giving my >> vote. >> > > +Heejong Lee to comment on this. > > >> >> On Wed, Jul 15, 2020 at 10:51 AM Pablo Estrada >> wrote: >> > >> > +1 >> > I was able to run the python 3.8 quickstart from wheels on Dir

Re: [VOTE] Release 2.30.0, release candidate #1

2021-06-08 Thread Heejong Lee
, Jun 8, 2021 at 12:45 PM Kenneth Knowles wrote: > +1 (binding) > > Verified wordcount with various configuration parameters and they all > worked. Particularly confirming that all the containers are chosen > correctly. > > Kenn > > On Tue, Jun 8, 2021 at

Re: [VOTE] Release 2.30.0, release candidate #1

2021-06-08 Thread Heejong Lee
. > Sure. Please let me know when you finish the validation. > > Kenn > > On Mon, Jun 7, 2021 at 4:28 PM Heejong Lee wrote: > >> FYI, we now have three binding votes and I will close the vote tomorrow >> morning. >> >> The RC build is validated for most qu

Re: [VOTE] Release 2.30.0, release candidate #1

2021-06-07 Thread Heejong Lee
> >> > >> Thanks, > >> Cham > >> > >> On Thu, Jun 3, 2021 at 2:03 PM Tomo Suzuki wrote: > >>> > >>> +1 (non-binding) > >>> > >>> Thank you for the preparation. With the GCP dependencies of my > inte

[NEED HELP] PMC only finalization items for release 2.30.0

2021-06-09 Thread Heejong Lee
Hi, I'm finishing 2.30.0 release and need help doing PMC only finalization items in the release guide ( https://beam.apache.org/contribute/release-guide/#10-finalize-the-release). Please let me know if any PMC members have some time to do these tasks :) Thanks!

Re: [PROPOSAL] Preparing for Beam 2.30.0 release

2021-05-13 Thread Heejong Lee
%20Implementation%22%2C%20%22Triage%20Needed%22)%20AND%20fixVersion%20%3D%202.30.0 I will start building the RC release after we cherry-pick the last blocker. On Thu, Apr 29, 2021 at 12:48 AM Heejong Lee wrote: > We have 10 open issues for Fix Version 2.30.0: > https://issues.apache.org/jira/

Need maintainer permission of PyPI apache-beam package

2021-05-18 Thread Heejong Lee
Hi, I'm currently working on Beam 2.30.0 release and need help adding myself to the maintainer group of PyPI apache-beam package. My PyPI username is 'ihji'. Does anybody have permission for adding a new member to the apache-beam maintainer group? Thanks!

Re: Need maintainer permission of PyPI apache-beam package

2021-05-19 Thread Heejong Lee
It's done. Thanks Pablo! On Wed, May 19, 2021 at 11:21 AM Pablo Estrada wrote: > I've sent you an invite to be a project maintainer. Let me know if that > works. > Best > -P. > > On Tue, May 18, 2021 at 6:28 PM Heejong Lee wrote: > >> Hi, >> >> I'm c

[NEED HELP] Populating the change list for 2.30.0 release

2021-05-27 Thread Heejong Lee
Hi Beam developers, I'm gathering the information for the changes in the 2.30.0 release. If you have any idea about important *new features* / *breaking changes* / *deprecation* / *known issues* for the 2.30.0 release, please note them down in CHANGES.md or just let me know. Thanks!

[VOTE] Release 2.30.0, release candidate #1

2021-06-03 Thread Heejong Lee
Hi everyone, Please review and vote on the release candidate #1 for the version 2.30.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) Reviewers are encouraged to test their own use cases with the release candidate, and vote +1 if no

Re: [PROPOSAL] Preparing for Beam 2.30.0 release

2021-04-26 Thread Heejong Lee
> because each release represents a certain quantity of changes. But in this > case, the actual quantity of changes is affected by the re-cut, too. > > > > > > On Wed, Apr 21, 2021 at 4:12 PM Heejong Lee > wrote: > > >> > > >> Update on the 2.30.0

Re: [PROPOSAL] Preparing for Beam 2.30.0 release

2021-04-21 Thread Heejong Lee
e, Apr 20, 2021 at 4:55 PM Heejong Lee wrote: > >> Hi All, >> >> Beam 2.30.0 release is scheduled to be cut on April 21 according to the >> release calendar [1] >> >> I'd like to volunteer myself to be the release manager for this release. >> I pl

Re: [PROPOSAL] Preparing for Beam 2.30.0 release

2021-04-29 Thread Heejong Lee
> On Mon, Apr 26, 2021 at 2:33 PM Heejong Lee wrote: > >> >> >> On Mon, Apr 26, 2021 at 10:24 AM Robert Bradshaw >> wrote: >> >>> Confirming that the cut date is 4/28/2021 (in two days), right? >>> >> >> Yes, 2.30.0 branch is schedu

Re: [PROPOSAL] Preparing for Beam 2.30.0 release

2021-04-29 Thread Heejong Lee
AM Heejong Lee wrote: > FYI, I just cut the 2.30.0 release branch. From now on, late commits for > 2.30.0 need to be cherry-picked. If you have any late commits, please make > sure that their Jira issues have the correct Fix Version, 2.30.0. > > On Tue, Apr 27, 2021 at 7:52 AM

[PROPOSAL] Preparing for Beam 2.30.0 release

2021-04-20 Thread Heejong Lee
Hi All, Beam 2.30.0 release is scheduled to be cut on April 21 according to the release calendar [1] I'd like to volunteer myself to be the release manager for this release. I plan on cutting the release branch on the scheduled date. Any comments or objections ? Thanks, Heejong [1]