I can answer for the case of SolrIO and ElasticsearchIO, Luke.
Retrying in SolrIO was my first contribution to Beam and I see in the PR
[1] that I was just copying JdbcIO for styling. ElasticsearchIO then
followed suit.
Exposing FluentBackoff seems sensible to me.
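
To make the retry discussion concrete, here is a minimal sketch of the exponential backoff idea in plain Java. It is not Beam's FluentBackoff API and not the SolrIO code; the class and method names (RetrySketch, backoffMillis, callWithRetry) are hypothetical and only illustrate the kind of knobs (max attempts, initial and max delay) that could be exposed.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration only; this is NOT Beam's FluentBackoff. */
final class RetrySketch {
  private RetrySketch() {}

  /** Delay before retry number {@code retry} (0-based): doubles each time, capped at maxMillis. */
  static long backoffMillis(long initialMillis, long maxMillis, int retry) {
    long delay = initialMillis;
    // Stop doubling once the cap is reached so the value cannot grow unbounded.
    for (int i = 0; i < retry && delay < maxMillis; i++) {
      delay *= 2;
    }
    return Math.min(delay, maxMillis);
  }

  /** Runs the call up to maxAttempts times, sleeping between failed attempts. */
  static <T> T callWithRetry(
      Callable<T> call, int maxAttempts, long initialMillis, long maxMillis) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Exception last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return call.call();
      } catch (Exception e) {
        last = e;
        // Only sleep if another attempt will follow.
        if (attempt + 1 < maxAttempts) {
          Thread.sleep(backoffMillis(initialMillis, maxMillis, attempt));
        }
      }
    }
    // All attempts failed: surface the last failure to the caller.
    throw last;
  }
}
```

A real IO would presumably retry only on errors it considers transient rather than on every exception; that filtering is left out here.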
[1]
with the
reshuffle example and parallelism.
Thanks,
Tim
On Thu, Oct 3, 2019 at 1:21 PM Jan Lukavský wrote:
> Hi Tim,
>
> can you please elaborate more about some parts?
>
> 1) What happens actually in your case? What are the specific settings you
> use?
>
> 3)
+1, I'd love to see this as a recording. Will you stick it up on youtube
afterwards?
On Thu, Jul 18, 2019 at 4:00 AM sridhar inuog
wrote:
> Thanks, Pablo! Looking forward to it! Hopefully, it will be recorded
> as well.
>
> On Wed, Jul 17, 2019 at 2:50 PM Pablo Estrada wrote:
>
>> Yes! So
Congratulations Robert!
On Wed, Jul 17, 2019 at 2:47 PM Gleb Kanterov wrote:
> Congratulations, Robert!
>
> On Wed, Jul 17, 2019 at 1:50 PM Robert Bradshaw
> wrote:
>
>> Congratulations!
>>
>> On Wed, Jul 17, 2019, 12:56 PM Katarzyna Kucharczyk <
>> ka.kucharc...@gmail.com> wrote:
>>
>>>
to explore the helpdesk of dataflow. I notice for
example others report this on SO
https://stackoverflow.com/questions/30189691/dataflow-zombie-jobs-stuck-in-not-started-state
I hope this is somewhat useful,
Tim
On Thu, Jun 27, 2019 at 8:12 AM Chaim Turkel wrote:
> since the night all my jobs that i
Congratulations Mikhail!
On Fri, Jun 21, 2019 at 12:37 PM Robert Burke wrote:
> Congrats
>
> On Fri, Jun 21, 2019, 12:29 PM Thomas Weise wrote:
>
>> Hi,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>> Mikhail Gryzykhin.
>>
>> Mikhail has been contributing to
This is great. Thanks Pablo and all
I've seen several folk struggle with writing avro to dynamic locations
which I think might be a good addition. If you agree I'll offer a PR unless
someone gets there first - I have an example here:
>> On 25 Mar 2019, at 18:36, Mark Liu wrote:
>>
>> Thank you all! It's a great pleasure to work on Beam!
>>
>> Mark
>>
>> On Mon, Mar 25, 2019 at 10:18 AM Robin Qiu wrote:
>>
>> Congratulations, Mark!
>>
>> On Mon, Mar 25, 2019 at 9:31 A
Thank you for running the release Andrew
On Thu, Apr 25, 2019 at 8:24 PM Andrew Pilloud wrote:
> I reran the Nexmark tests, each runner passed. I compared the numbers
> on the direct runner to the dashboard and they are where they should
> be.
>
> With that, I'm happy to announce that we have
Congratulations Yifan!
On Mon, Apr 22, 2019 at 5:39 PM Cyrus Maden wrote:
> Congratulations Yifan!!
>
> On Mon, Apr 22, 2019 at 11:26 AM Kenneth Knowles wrote:
>
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming a new committer:
>> Yifan Zou.
>>
>> Yifan has been
Many congratulations Boyuan!
On Thu, Apr 11, 2019 at 10:50 AM Łukasz Gajowy wrote:
> Congrats Boyuan! :)
>
> śr., 10 kwi 2019 o 23:49 Chamikara Jayalath
> napisał(a):
>
>> Congrats Boyuan!
>>
>> On Wed, Apr 10, 2019 at 11:14 AM Yifan Zou wrote:
>>
>>> Congratulations Boyuan!
>>>
>>> On Wed,
Congratulations Mark!
On Mon, Mar 25, 2019 at 3:18 PM Michael Luckey wrote:
> Nice! Congratulations, Mark.
>
> On Mon, Mar 25, 2019 at 2:42 PM Katarzyna Kucharczyk <
> ka.kucharc...@gmail.com> wrote:
>
>> Congratulations, Mark!
>>
>> On Mon, Mar 25, 2019 at 11:24 AM Gleb Kanterov wrote:
>>
Congrats Raghu
On Thu, Mar 7, 2019 at 7:09 PM Ahmet Altay wrote:
> Congratulations!
>
> On Thu, Mar 7, 2019 at 10:08 AM Ruoyun Huang wrote:
>
>> Thank you Raghu for your contribution!
>>
>>
>>
>> On Thu, Mar 7, 2019 at 9:58 AM Connell O'Callaghan
>> wrote:
>>
>>> Congratulation Raghu!!! Thank
Congrats Michael and welcome.
On Thu, Feb 28, 2019 at 7:41 AM Gleb Kanterov wrote:
> Congratulations and welcome!
>
> On Wed, Feb 27, 2019 at 8:57 PM Connell O'Callaghan
> wrote:
>
>> Excellent thank you for sharing Kenn!!!
>>
>> Michael congratulations for this recognition of your
What a shame for the project but best of luck for the future Scott.
Thanks for all your contributions - they have been significant!
Tim
On Thu, Feb 14, 2019 at 7:37 PM Scott Wegner wrote:
> I wanted to let you all know that I've decided to pursue a new adventure
> in my career, which wil
Congratulations Etienne!
Tim
> On 25 Jan 2019, at 23:00, Kenneth Knowles wrote:
>
> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming Etienne Chauchot to
> join the PMC.
>
> Etienne introduced himself to dev@ in September of 2017 and over the
Thanks Kenn
I tend to think that timing is the main contributing factor as you note on
the Jira - it slipped down with no reminders / bumps sent on any channels
that I can see.
Would something that alerts the dev@ list of PRs that have not received any
attention after N days be helpful perhaps?
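
A minimal sketch of what such a check might look like, in plain Java. The PullRequest type and the staleNumbers method are hypothetical; how the open PRs and their last-activity times would be fetched (and how the alert would be sent to dev@) is deliberately left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: pick out PRs with no activity for more than N days. */
final class StalePrSketch {
  private StalePrSketch() {}

  /** Minimal, assumed view of a pull request: its number and last activity time. */
  static final class PullRequest {
    final int number;
    final Instant lastActivity;

    PullRequest(int number, Instant lastActivity) {
      this.number = number;
      this.lastActivity = lastActivity;
    }
  }

  /** Returns the numbers of PRs whose last activity is older than {@code days} days before now. */
  static List<Integer> staleNumbers(List<PullRequest> open, Instant now, int days) {
    Instant cutoff = now.minus(Duration.ofDays(days));
    List<Integer> stale = new ArrayList<>();
    for (PullRequest pr : open) {
      if (pr.lastActivity.isBefore(cutoff)) {
        stale.add(pr.number);
      }
    }
    return stale;
  }
}
```

The value of N is the only tunable here; a scheduled job could run this daily and mail the resulting list.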
Welcome Gleb and congratulations!
On Fri, Jan 25, 2019 at 8:06 AM Kenneth Knowles wrote:
> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new committer: Gleb
> Kanterov
>
> Gleb started contributing to Beam and quickly dove deep, doing some
> sensitive fixes to schemas,
Thank you for running the release Chamikara.
Tim,
Sent from my iPhone
> On 14 Dec 2018, at 10:30, Matt Casters wrote:
>
> Great news! Congratulations!
> My experience venturing into the world of Apache Beam couldn't possibly have
> been nicer. Thank you to all involve
Thanks Cham
+1
> On 16 Nov 2018, at 05:30, Thomas Weise wrote:
>
> +1
>
>
>> On Thu, Nov 15, 2018 at 4:34 PM Ahmet Altay wrote:
>> +1 Thank you.
>>
>>> On Thu, Nov 15, 2018 at 4:22 PM, Kenneth Knowles wrote:
>>> SGTM. Thanks for keeping track of the schedule.
>>>
>>> Kenn
>>>
On
Thanks for raising this Anton
> It would be very easy to forward new SO questions to the user@ list, or
> a new list if we're worried about the noise.
+1 (preference on user@ until there are too many)
On Mon, Nov 5, 2018 at 7:18 PM Scott Wegner wrote:
> I like the idea of working to
Congratulations and welcome!
Tim
> On 1 Nov 2018, at 17:06, Matthias Baetens wrote:
>
> Congrats David!!!
>
>> On Thu, Nov 1, 2018, 16:04 Kenneth Knowles wrote:
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming a ne
Everything worked, and performance was similar on both.
We built using maven pointing at
https://repository.apache.org/content/repositories/orgapachebeam-1049/
Based on this limited testing: +1
Thank you to the release managers,
Tim
On Thu, Oct 25, 2018 at 7:21 PM Tim wrote:
> I can do some te
I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry I’ve just
been too busy to assist.
Tim
> On 25 Oct 2018, at 18:59, Kenneth Knowles wrote:
>
> I tried to do a more thorough job on this.
>
> - I could not reproduce the slowdown in Query 9. I belie
> > - Łukasz Gajowy, testing infrastructure, benchmarks, build system
> > improvements
> > - Anton Kedin, contributor to SQL and schemas, helper on
> StackOverflow
> > - Andrew Pilloud, contributor to SQL, very active on dev@, infra
> > and release he
the style?
Tim
On Fri, Oct 12, 2018 at 6:35 PM Kenneth Knowles wrote:
> Personally, I think cwiki is best for dev community, while important stuff
> for users should go on the web site. But experimenting with the content on
> cwiki seems like a quick and easy thing to try out.
>
&
Thank you JB for starting this discussion.
Others comment on many of these points far better than I can, but my
experience is similar to JB's.
1. IDEA integration (and laptop slowing like crazy) being the biggest
contributor to my feeling of being unproductive
2. Not knowing the correct way to
>> release cut date. Is there something we should do to better communicate
>> release cuts in the future?
>>
>>
>> https://lists.apache.org/thread.html/c4da2a5594d22121b5864662e64a027148993b5e0187ce5beda2714e@%3Cdev.beam.apache.org%3E
>>
>> Andrew
>>
>>
harles
announced:
I will cut the initial 2.7.0 release branch on September 7.
Do you think this is a case of unfortunate timing (was it cut early?) and
we just overlooked cherry-picking that commit?
Do we correct release notes when mistakes are spotted?
Thanks,
Tim
[1]
https://issues.apache.or
I was in the middle of writing something similar when Ismaël posted.
Please do bear in mind that this is an international project and 7hrs is
not long enough to decide upon something that affects us all.
+1 on cutting 2.8.0 on 10/10 and thank you for pushing it forward
-1 on designating it as
(2.6.0): 1.7hrs
- Beam AvroIO with the 5036 fix: 42 minutes
Related: I also anticipate that varying the spark.default.parallelism will
affect Beam runtime.
Thanks,
Tim
[1] https://github.com/apache/beam/pull/6289
[2] https://github.com/gbif/beam-perf/tree/master/avro-to-avro
On Fri, Sep 28
Beam 2.6.0 but unless I'm mistaken the
code exists only on a branch [3] and hasn't been touched for a while.
Thanks,
Tim
[1] https://beam.apache.org/documentation/runners/capability-matrix/
[2] https://beam.apache.org/documentation/runners/mapreduce/
[3] https://github.com/apache/beam/tree/mr-runn
Thank you to Davor and all the PMC - I can only imagine how much work it has
been to get Beam to where it is today.
Congratulations Kenn!
On Thu, Sep 20, 2018 at 1:05 AM Tyler Akidau wrote:
> Thanks Davor, and congrats Kenn!
>
> -Tyler
>
> On Wed, Sep 19, 2018 at 2:43 PM Yifan Zou wrote:
>
>>
rds
> > JB
> >
> > On 19/09/2018 05:34, devinduan(段丁瑞) wrote:
> > > Hi,
> > > Thanks for your reply.
> > > Our team plans to use Beam instead of Spark, so I'm testing the
> > > performance of Beam API.
> > >
e of the parallelism). The files are named
/tmp/wordcount-000*-of-00045.
I hope this helps provide a few pointers, but if you elaborate on your
environment we might be able to assist more.
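
For reference, the shard file names above follow a prefix-shard-of-total pattern with five-digit zero padding. A minimal sketch of that naming in plain Java, where ShardNameSketch and shardName are hypothetical names used only for illustration (this is not the Beam code that produces the files):

```java
import java.util.Locale;

/** Hypothetical sketch of the "prefix-SSSSS-of-NNNNN" shard naming seen in the output files. */
final class ShardNameSketch {
  private ShardNameSketch() {}

  /** Formats one shard's file name, e.g. shard 0 of 45 under the given prefix. */
  static String shardName(String prefix, int shard, int numShards) {
    if (shard < 0 || shard >= numShards) {
      throw new IllegalArgumentException("shard must be in [0, numShards)");
    }
    // Locale.ROOT keeps the digits ASCII regardless of the JVM's default locale.
    return String.format(Locale.ROOT, "%s-%05d-of-%05d", prefix, shard, numShards);
  }
}
```

With 45 shards the names run from shard 0 to shard 44, which is what the /tmp/wordcount-000*-of-00045 glob matches.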
Best wishes,
Tim
On Tue, Sep 18, 2018 at 9:29 AM Robert Bradshaw wrote:
> There are k
+1
> On 15 Sep 2018, at 01:23, Yifan Zou wrote:
>
> +1
>
>> On Fri, Sep 14, 2018 at 4:20 PM David Morávek
>> wrote:
>> +1
>>
>>
>>
>>> On 15 Sep 2018, at 00:59, Anton Kedin wrote:
>>>
>>> +1
>>>
On Fri, Sep 14, 2018 at 3:22 PM Alan Myrvold wrote:
+1
> On Fri, Sep
+1 (non googler)
It sounds pragmatic, helps with transparency should issues arise and
enables more people to fix.
On Thu, Sep 13, 2018 at 8:15 PM Dan Halperin wrote:
> From my perspective as a (non-Google) community member, huge +1.
>
> I don't see anything bad for the community about open
Thanks for sharing Manu - interesting paper indeed.
Tim
> On 10 Sep 2018, at 16:02, Maximilian Michels wrote:
>
> Excellent write-up. Thank you!
>
>> On 09.09.18 20:43, Jean-Baptiste Onofré wrote:
>> Good idea. It could also help people who want to create runners.
>
Another +1 for option 3 (and preference of HadoopFormatIO naming).
Thanks Alexey,
Tim
> On 7 Sep 2018, at 19:13, Andrew Pilloud wrote:
>
> +1 for option 3. That approach will keep the mapping clean if SQL supports
> this IO. It would be good to put the proxy in the old mod
ng the most recent versions can be good to be close to the
>>>>>> current development of other projects and some of the fixes, but these
>>>>>> versions are commonly not deployed for most users and adopting a LTS
>>>>>> or stable only appro
from a strategy on which community of users
Beam is targeting?
(OT: I'm collecting some thoughts on what we might consider to target
enterprise hadoop users - kerberos on all relevant IO, performance, leaking
beyond encryption zones with temporary files etc)
Thanks,
Tim
18 at 11:55 AM Jozef Vilcek wrote:
> Just for reference, there is a JIRA open for
> FileBasedSink.moveToOutputFiles() and filesystem move behavior
>
> https://issues.apache.org/jira/browse/BEAM-5036
>
>
> On Wed, Aug 22, 2018 at 9:15 PM Tim Robertson
> wrote:
>
>>
/FileSystems.java#L288
On Wed, Aug 22, 2018 at 8:52 PM Tim Robertson
wrote:
> > Does HDFS support a fast rename operation?
>
> Yes. From the shell it is “mv” and in the Java API it is “rename(Path src,
> Path dst)”.
> I am not aware of a fast copy though. I think an HDFS copy s
t much
>>> of that graph is never used (empty PCollections).
>>> >> >
>>> >> > On Wed, Aug 22, 2018 at 3:12 AM Robert Bradshaw <
>>> rober...@google.com> wrote:
>>> >> >>
>>> >> >> I agree that this is
> On Wed, Aug 22, 2018 at 1:35 PM Tim Robertson
> wrote:
> >
> > Thanks Robert
> >
> > > Have you profiled to see which stages and/or operations are taking up
> the time?
> >
> > Not yet. I'm browsing through the spark DAG produced which I've
>
and/or operations are taking up the
> time?
> On Wed, Aug 22, 2018 at 11:29 AM Tim Robertson
> wrote:
> >
> > Hi folks,
> >
> > I've recently been involved in projects rewriting Avro files and have
> discovered a concerning performance trait in Beam.
> >
>
), a union, a GBK etc.
Before I go too far with exploration I'd appreciate thoughts on whether we
believe this is a concern (I do), if we should explore optimisations or any
insight from previous work in this area.
Thanks,
Tim
[1] https://github.com/gbif/beam-perf/tree/master/avro-to-avro
Thanks for this Vaclav
The failing test (1 minute timeout exception) is something we see sometimes
and indicates issues in the build environment or a flakey test. I triggered
another build by leaving a comment in the PR - just fyi, this is something
you can also do in the future.
On Tue,
+1 for CREATE EXTERNAL TABLE with similar reasoning given by others on this
thread.
Tim
> On 15 Aug 2018, at 23:01, Charles Chen wrote:
>
> +1 for CREATE EXTERNAL TABLE. It is a good balance between the general SQL
> expectation of having tables as an abstraction and reinforci
+1 (non binding)
With apologies to Valentyn and others, we only had time to test what was
feasible for us this week. Our existing pipelines, which read and write
using AvroIO / HDFS on Spark 2.3 (Cloudera), ran without issue on 2.6.0RC1,
and our project tests all pass with 2.6.0RC1.
We'd
I took a pass at reviewing (non committer). I haven't worked on unbounded
IO, so I wasn't familiar enough with the timestamp and checkpointing parts,
but otherwise it LGTM in general. Thanks John, also for applying the minor
suggestions.
OT: Reuven, if you have time on your hands there is also the KuduIO
ons or improvement requests on that jira.
The offer to assist in your first PR remains open for the future - please
don't hesitate to ask.
Thanks,
Tim
[1]
https://github.com/jsteggink/beam/tree/BEAM-3199/sdks/java/io/elasticsearch-6/src/main/java/org/apache/beam/sdk/io/elasticsearch
[2] ht
Hi Udi
I asked the GH helpdesk and they confirmed that only people with write
access will actually be automatically chosen.
I don't expect it should stop us using it, but we should be aware that
there are non-committers also willing to review.
Thanks,
Tim
On Thu, Jul 12, 2018 at 7:24 PM
Thanks for this Yifan,
I've added my name to all Hadoop-related dependencies, along with Solr and ES.
On Thu, Jun 28, 2018 at 3:28 PM, Etienne Chauchot
wrote:
> I've added myself and @Tim Robertson on elasticsearchIO related deps.
>
> Etienne
>
> Le mercredi 27 juin 2018 à 14:05 -
Thanks also to you Scott
Tim
> On 27 Jun 2018, at 18:39, Scott Wegner wrote:
>
> Six weeks ago [1] we began an effort to improve the quality of the Java
> codebase via ErrorProne static analysis, and promoting compiler warnings to
> errors. As of today, all of our Java pro
++1
On Wed, Jun 27, 2018 at 7:36 AM, Ahmet Altay wrote:
> +1
>
> This is a great idea. Does anyone know a similar tool for python? I believe
> go already has this as part of its tools with go fmt.
>
>
> On Tue, Jun 26, 2018 at 9:55 PM, Ankur Goenka wrote:
>
>> +1
>>
>> Intellij can help but
the dependency at the 1.3.9-1. I believe our general
direction is to remove findbugs when errorprone covers all aspects so I
*expect* this should be considered reasonable.
I hope this helps,
Tim
[1] https://github.com/stephenc/findbugs-annotations/issues/4
[2] https://maven.apache.org/guides/mini
ve checked is in the 3.0.1-1 build [2]
I notice in your commits [1] you've been exploring version 3.0.0 already
though... what happens when you use 3.0.1-1? It sounds like the wrong
version is coming in rather than the annotation being missing.
Thanks,
Tim
[1]
https://github.com/stephenc/findbugs-a
Tested by our team:
- mvn inclusion
- Avro, ES, Hadoop IF IO
- Pipelines run on Spark (Cloudera 5.12.0 YARN cluster)
- Reviewed release notes
+1
Thanks also to everyone who helped get over the gradle hurdle and in particular
to JB.
Tim
> On 9 Jun 2018, at 05:56, Jean-Baptiste Onofré wr
Congratulations!
Tim
> On 1 Jun 2018, at 07:05, Andrew Psaltis wrote:
>
> Congrats!
>
>> On Fri, Jun 1, 2018 at 12:26 AM, Thomas Weise wrote:
>> Congrats!
>>
>>
>>> On Thu, May 31, 2018 at 9:25 PM, Alan Myrvold wrote:
>>> Congrats Gri
us limitations there but it really improves code quality and prevents
>> blunders. I'm not sure errorprone covers that. I know the Checker analyzer
>> has a full solution that makes NPE impossible as in most modern languages.
>> Maybe that is easy to plug in. The core Java SDK
?
Thanks,
Tim
[1]
https://lists.apache.org/thread.html/95aae2785c3cd728c2d3378cbdff2a7ba19caffcd4faa2049d2e2f46@%3Cdev.beam.apache.org%3E
have seen include:
- archive is not a ZIP archive
- invalid block type
- too many length or distance symbols
- It is using the zip reader org.apache.tools.zip.ZipFile (from Ant I
believe)
I hope this helps,
Tim
On Thu, May 17, 2018 at 3:15 PM, Etienne Chauchot <ech
You're very welcome. Glad you have it sorted.
On Fri, May 11, 2018 at 12:48 PM, Carlos Alonso <car...@mrcalonso.com>
wrote:
> Hi Tim, many thanks for your help. It's definitely interesting, but
> unfortunately not useful this time, I think, as that JsonTypeInfo and
> JsonSubClas
instructs the deserializers which Object to instantiate. I'm not sure if
newer Jackson versions have changed.
I haven't considered if this is appropriate or not in your case, but I hope
this helps with the Jackson bit of your question at least.
Best wishes,
Tim
On Wed, May 9, 2018 at 7:02
Will do - I'll report the result on https://github.com/apache/beam/pull/4905
On Thu, Apr 5, 2018 at 11:45 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
> For info, Romain's PR was merged today, can you confirm if this fixes
> the issue Tim.
>
> On Sun, Apr 1, 2018 at 9:21
it:
>
>> Correct - teardown is currently run in the direct runner, but
>> asynchronously. I believe Romain's pending PRs should solve this for your
>> use case.
>>
>> On Sun, Apr 1, 2018 at 3:13 AM Tim Robertson < timrobertson...@gmail.com>
>> wrote:
>>
Hi,
I would like to get started with contributing and thought I'd start
with this, if that is ok:
https://issues.apache.org/jira/browse/BEAM-1056
Could somebody please assign it to me?
Best regards,
Tim