I saw this and was particularly excited about the new support for
"external" transforms in portable runners like Python (i.e. the ability to
use the Java KafkaIO transforms, with presumably more to come in the
future). While the release notes are useful, I will say that it takes a
lot of time and
and now the news is on the twitterwebs
https://twitter.com/datancoffee/status/1137160729386074113
On Fri, Jun 7, 2019 at 5:52 PM Reza Rokni wrote:
> +1 on the pattern Tim!
>
> Please raise a Jira with the label pipeline-patterns, details are here:
>
>
+1 on the pattern Tim!
Please raise a Jira with the label pipeline-patterns, details are here:
https://beam.apache.org/documentation/patterns/overview/#contributing-a-pattern
On Sat, 8 Jun 2019 at 05:04, Tim Robertson
wrote:
> This is great. Thanks Pablo and all
>
> I've seen several folk
Awesome! Thanks for leading the release Ankur.
On Fri, Jun 7, 2019 at 2:57 PM Ankur Goenka wrote:
> The Apache Beam team is pleased to announce the release of version 2.13.0!
>
> Apache Beam is an open source unified programming model to define and
> execute data processing pipelines, including
This is great. Thanks Pablo and all
I've seen several folk struggle with writing avro to dynamic locations
which I think might be a good addition. If you agree I'll offer a PR unless
someone gets there first - I have an example here:
Hello everyone,
A group of community members has been working on gathering and providing
common pipeline patterns for pipelines in Beam. These are examples of how
to perform certain operations, and useful ways of using Beam in your
pipelines. Some of them relate to processing of files, use of side
Thanks for the doc. This is really clear and readable. It all looks like a
good improvement, whatever the result of the various open threads. And nice
bonus that you've pointed to more good reading material.
Kenn
On Fri, Jun 7, 2019 at 12:25 PM Alireza Samadian
wrote:
> Thank you so much.
>
>
Nice. I noticed the huge drop in untriaged issues. Both of those ideas for
automation sound reasonable.
I think the other things that are harder to optimize can probably be
addressed by re-triaging stale bugs. We will probably find those that
should have been closed and those that are just
Thank you so much.
Best,
Alireza
On Fri, Jun 7, 2019 at 11:48 AM Pablo Estrada wrote:
> I've added you as a contributor! : )
>
> On Fri, Jun 7, 2019 at 11:20 AM Alireza Samadian
> wrote:
>
>> Hi,
>>
>> I am going to create Issues in Jira and start implementing row estimation
>> of each source
I've added you as a contributor! : )
On Fri, Jun 7, 2019 at 11:20 AM Alireza Samadian
wrote:
> Hi,
>
> I am going to create Issues in Jira and start implementing row estimation
> of each source separately. I will appreciate if someone gives me the
> permission to assign Jira Issues to myself.
I agree with you. A more recent LTS release with python 2 support will be
good. Cost of maintaining python 2 support is also fairly low (maybe zero
actually besides keeping some pre-existing compatibility code).
I believe we are referring to two separate things with support:
- Supporting existing
Hi,
I am going to create Issues in Jira and start implementing row estimation
of each source separately. I would appreciate it if someone could give me
permission to assign Jira Issues to myself. My Jira id is riazela.
Best,
Alireza
On Fri, May 31, 2019 at 3:54 PM Alireza Samadian
wrote:
> Dear
The topic of schema registries probably does not block the design and
implementation of logical types and portable schemas by themselves, however
I think we should spend some time discussing it (probably in a separate
thread) so that all SDKs have similar mechanisms for schema registration
and
We have currently been having every runner define and manage its own
suite/tests so yes modifying flink_runner.gradle is currently the correct
thing to do.
There is a larger discussion about whether this is the right way since we
would like to capture things like perf benchmarks and
Here is an idea how this could be done: Create a JIRA ticket that will
always remain open. Have folks append their suggested tweets as comments.
Interested PMC members can watch that ticket.
Thomas
On Thu, Jun 6, 2019 at 10:41 AM Thomas Weise wrote:
> Pinging individual PMC members doesn't
I also noticed that the build takes significantly less time on my machine,
several mins saved.
On Fri, Jun 7, 2019 at 9:54 AM Lukasz Cwik wrote:
> Guava was the only thing that we shaded everywhere but the original intent
> was for us to shade more and more by default until we decided to do
>
Guava was the only thing that we shaded everywhere but the original intent
was for us to shade more and more by default until we decided to do
vendoring (which is a better solution).
So yes, this really only removed shading of Guava; we still have shading in
all these other places:
model/*
Even though we don't support iteration, one could have a known upperbound
and "unroll" the loop to a fixed number of iterations statically before the
pipeline is run but I agree with Eugene on his other points.
On Fri, Jun 7, 2019 at 3:59 AM Robert Burke wrote:
> I'm not sure I understand
Wouldn't SDK specific types always be under the "coders" component instead
of the logical type listing?
Offhand, having a separate normalized listing of logical schema types in
the pipeline components message of the types seems about right. Then
they're unambiguous, but can also either refer to
I'm not sure I understand the desired properties of GroupByMultiKey.
Offhand, am I right in interpreting GroupByMultiKey as essentially forming a
graph of the keys based on the MultiKeys nodes, with the number of resulting
iterables based on the components of the graph?
If that's the case then,
It looks like you want to take a PCollection of lists of items of the same
type (but not necessarily of the same length - in your example you pad them
to the same length but that's unnecessary), induce an undirected graph on
them where there's an edge between XS and YS if they have an element in
Hi,
that sounds interesting, but it seems to be computationally intensive
and might not scale well, if I understand it correctly. It looks
like it needs a transitive closure, am I right?
Jan
On 6/7/19 11:17 AM, i.am.moai wrote:
Hello everyone, nice to meet you
I am Naoki Hyu(日宇尚記).
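A note on the GroupByMultiKey discussion above: the replies describe it as building an undirected graph over the input lists and emitting one group per connected component. The sketch below shows that idea on a single machine with union-find. It is plain Python, not Beam, and it makes two assumptions that the truncated messages do not confirm: that two lists are linked when they share at least one element, and that the helper name group_by_shared_elements (hypothetical, not a Beam API) is an acceptable stand-in for the proposed transform. It says nothing about how the transitive closure would be distributed, which is the scalability concern Jan raises.

```python
from collections import defaultdict


def group_by_shared_elements(lists):
    """Group lists that are connected through shared elements.

    Hypothetical helper for illustration only; not part of Beam.
    Assumes two lists are linked when they have an element in common.
    """
    parent = list(range(len(lists)))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    # Link each list to the first list in which each of its elements appeared.
    first_seen = {}
    for index, items in enumerate(lists):
        for item in items:
            if item in first_seen:
                union(first_seen[item], index)
            else:
                first_seen[item] = index

    # Collect the lists belonging to each connected component.
    groups = defaultdict(list)
    for index, items in enumerate(lists):
        groups[find(index)].append(items)
    return list(groups.values())
```

Lists of different lengths work without padding, and membership is transitive: a list sharing one element with any member of a group joins that group.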
Sounds like a good idea. I think the same can be done for Flink; Flink's and
Spark's APIs are similar to a large degree.
Here is also a link to the transforms:
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/
-Max
On 04.06.19 03:20, Ahmet Altay wrote:
Thank you
Created an up-to-date version of the Flink backports for 2.7.1:
https://github.com/apache/beam/pull/8787
Some of the Gradle task names have changed which makes testing via Jenkins
hard. Will have to run them manually before merging.
-Max
On 06.06.19 17:41, Kenneth Knowles wrote:
Hi all,
Hello everyone, nice to meet you
I am Naoki Hyu (日宇尚記), a developer living in Tokyo. I often use Scala and
Python, my favorite languages.
I have no experience with OSS development, but as I use Dataflow at work, I
want to contribute to the development of Beam.
In fact, there is a feature I want
I don't think the second release with robust/recommended Python 3
support should be the last release with Python 2 support--that is
simply not enough time for people to migrate. (Look at how long it
took us...) It does make a lot of sense to at least have one LTS
release with support for both.
This is fantastic. I took a look at the PR and did not see anything that
jumped out at me. I also validated it with two external projects using
today’s snapshots (after the merge), without issues so far. Great that we
finally tackled this, thanks Luke!
Have one minor comment because the title of the
I took a look and reduced the untriaged issues to around 100. However, I
noticed some patterns that are producing more untriaged issues
than we should have. Those can probably be automated (if JIRA has ways
to do it):
1. Issues created and assigned on creation can be marked as open.
2. Once an
Hi Reza, interesting suggestions, thanks.
When you mentioned join, I recalled an older issue (which apparently has
not yet been transferred to Beam's JIRA) [1]. Is this in any way related to
what you are implementing? Would you like to make your implementation
accessible via Euphoria DSL [2]?
Jan