>> > > components
>> > > and create additional tests as appropriate
>> > >
>> > > * Besides the integration tests in package
>> > org.apache.beam.sdk.extensions.sql,
>> > > there's another example in org.apache.beam.sdk.extensions.sql.ex
The current master has accumulated a good number of nice features
since 2.1.0, so a new release is welcome. I have two JIRAs/PRs that I
think are important to check/solve before the cut:
BEAM-2516 (this is a performance regression in the Direct runner on
Java). We had never really defined if a
+1 (non-binding)
- Validated signatures OK
- mvn clean verify -Prelease on both OpenJDK 1.7 and Oracle JDK 8 with
the docker development images (WIP), both OK
- Ran WordCount on local Flink and Spark runners OK
Everything looks nice, only one minor thing (not blocking at all). The
proto
Kenneth’s idea of using sketches for state with the State API is
very interesting and opens up some interesting use cases. I
haven’t thought it through yet, but I believe it is an
appealing use case for the sketches. Note that the origin of this work
was in the line of statistics, in
Congrats everyone, well deserved, excellent work guys!
On Fri, Aug 11, 2017 at 7:53 PM, Jesse Anderson
wrote:
> Welcome!
>
> On Fri, Aug 11, 2017, 10:48 AM Jason Kuster
> wrote:
>
>> Congrats to all, many thanks for the great
Not a blocker but maybe it is worth considering the fix for
https://issues.apache.org/jira/browse/BEAM-2587 too.
I also was bitten by this issue and I could only get it to work by
doing a 'pip install --user grpcio-tools' (not sure if this is a
proper solution but it works for me), however when I
ey basis.
>>
>> Interestingly, the "watch mutations" command would allow one to build a
>> streaming memcache IO which shows all changes occurring underneath.
>>
>> memcached protocol:
>> https://github.com/memcached/memcached/blob/master/doc/pr
Cody, not sure if I follow, but isn't Distribution in Beam similar to
codahale/dropwizard's Histogram (without the quantiles)?
Meters are also in the plan but not implemented yet, see the Metrics design doc:
https://s.apache.org/beam-metrics-api
If I understand correctly, what you want is to have some sort
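To make the comparison above concrete, here is a minimal sketch of a distribution-style metric that tracks only count, sum, min and max, with no quantiles. The class name `DistributionSketch` and its fields are hypothetical and written in plain Python for illustration; this is not Beam's or Dropwizard's implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class DistributionSketch:
    """Hypothetical stand-in for a distribution-style metric (not Beam's class)."""

    count: int = 0
    total: int = 0
    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, value: int) -> None:
        # Each reported value only touches four aggregates, so no samples are stored.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self) -> float:
        # The mean is derived from sum and count; quantiles cannot be derived.
        return self.total / self.count if self.count else 0.0


latencies = DistributionSketch()
for observed in (12, 7, 30):
    latencies.update(observed)
```

Because no individual samples are kept, a structure like this cannot answer quantile queries, which is the difference from a histogram that the message points out.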
Hello Reuven,
I finally took the time to read the Drain proposal, thanks a lot for
bringing this up. It looks like a nice fit with the current APIs and it
would be great if this could be implemented as much as possible in a
runner-independent way.
I am eager now to see the snapshot and update
O test code, I'd
> propose that we add comments in there directing people to the correct
> native source.
>
> S
> [1] writeThenRead style IO IT -
> https://lists.apache.org/thread.html/26ee3ba827c2917c393ab26ce97e7491846594d8f574b5ae29a44551@%3Cdev.beam.apache.org%3E
>
> On Tue, May 3
The whole goal of this discussion is to define what we should do
when someone wants to add a new IO that uses HIFIO. The consensus so
far, following the PR comments and this thread, is that it should be
discouraged and that those contributions be included as documentation on the
website, and that we
adable" IOs (better name suggestions appreciated :) -
> that could include a list of data stores that jdbc/jms/hifio support and
> link to HIFIO's info on how to use them. (That might also be a good place
> to document the performance tradeoffs of using HIFIO)
>
> S
>
>
Hello,
I created a new JIRA for this native implementation of the IO so feel
free to PR the 'native' implementation using this ticket.
https://issues.apache.org/jira/browse/BEAM-2357
We will discuss all the small details in the PR.
The old JIRA (BEAM-1158) will still be there just to add the
better to add just the tests/docs of how to use
them as proposed in the PR (option 2).
Feel free to comment/vote, or add a possible third option if you
think there is a better one.
Regards,
Ismaël Mejía
[1] https://issues.apache.org/jira/browse/BEAM-1158
Amazing milestone, congrats everyone!
On Wed, May 17, 2017 at 7:54 PM, Reuven Lax wrote:
> Sweet!
>
> On Wed, May 17, 2017 at 4:28 AM, Davor Bonaci wrote:
>
>> The first stable release is now complete!
>>
>> Release artifacts are available through
My vote, like Davor:
Slight preference toward 2.0.0, but fine with 1.0.0
On Thu, May 4, 2017 at 9:32 PM, Thomas Weise wrote:
> I'm in the relaxed 1.0.0 camp.
>
> --
> sent from mobile
> On May 4, 2017 12:29 PM, "Mingmin Xu" wrote:
>
>> I slightly prefer 1.0.0
Congratulations Davor!
Your membership is really deserved. You really have the Apache spirit!
On Thu, May 4, 2017 at 5:02 PM, Thomas Groh wrote:
> Congratulations!
>
> On Thu, May 4, 2017 at 7:56 AM, Thomas Weise wrote:
>
>> Congrats!
>>
>>
>> On Thu,
Hello,
I created the HiveIO JIRA and followed the initial discussions about
the best approach for HiveIO, so I first want to suggest that you read
the previous thread(s) on the mailing list.
https://www.mail-archive.com/dev@beam.incubator.apache.org/msg02313.html
The main idea I concluded from
+1 Great idea Aviem, thanks for bringing this subject to the mailing list.
I agree in particular with the part about freeing JIRAs. I think we shouldn’t
keep JIRAs assigned for things that we don’t expect to solve in
the next few weeks (the exception being the long features).
I would add
I have the impression this conversation went into a different
sub-discussion, ignoring the core subject, which is whether it makes sense to
implement PAssert as we are doing it right now (1), or
in a runner-agnostic way (2).
Big +1 for (2).
I also think this is critical enough to be
> For the basic “at most once” job, JStorm runner can be reused on Storm. But
> for “window”, “state” and “exactly once” job, unfortunately, JStorm runner
> can’t be reused. Anyway, we will figure out if the propagation is possible
> for Storm in the future.
>
>
>
>
Thanks Jingsong for answering, and for the Streamscope ref. I am going to
check the paper; the concept of non-global-checkpointing sounds super
interesting.
It is nice that you guys are also trying to promote the move to a unified model.
Regards,
Ismaël
On Sun, Apr 2, 2017 at 3:40 PM, JingsongLee
+1
From my previous work experience ORC in certain cases performs better
than Parquet and really deserves to be supported.
On Sat, Apr 1, 2017 at 5:58 PM, Ted Yu wrote:
> +1
>
>> On Apr 1, 2017, at 8:31 AM, Tibor Kiss wrote:
>>
>> Hello,
>>
>>
Excellent news,
Pei, it would be great to have a new runner. I am curious about how
different the implementations of Storm are from one another, considering that
there are already three 'versions': Storm, JStorm and Heron. I wonder
if one runner could translate to an API that would cover all of them (of
Thanks everyone, it feels great to be part of the team.
Congratulations to the other new committers!
-Ismaël
On Mon, Mar 20, 2017 at 2:50 PM, Tyler Akidau
wrote:
> Welcome!
>
> On Mon, Mar 20, 2017, 02:25 Jean-Baptiste Onofré wrote:
>
>> Welcome
This is a forgotten one. Stas, did you create a JIRA about this one? I
think this change should also be tagged as First version release,
because this is an API change and can break stuff if we do it later
on.
On Wed, Jan 11, 2017 at 4:30 PM, Jean-Baptiste Onofré wrote:
> Hi
e a hand if needed.
On Thu, Mar 16, 2017 at 9:17 AM, Jason Kuster
<jasonkus...@google.com.invalid> wrote:
> Thanks Ismael for the comments! Replied inline.
>
> On Wed, Mar 15, 2017 at 8:18 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
>
>> Excellent proposal, sorry to jump
r to supporting streaming with Spark 1
> runner, and having Structured Streaming advance in Spark 2, we could start
> work on Spark 2 runner in a separate branch.
>
> However, I do feel that we should use the Dataset API, starting with batch
> support first. WDYT ?
>
> On Wed,
not heavily
> investing there.
>
> We could think of starting to migrate the Spark 1 runner to Spark 2 and
> follow with Dataset API support feature-by-feature as it advances, but I
> think most Spark installations today still run 1.X, or am I wrong ?
>
> On Wed, Mar 15, 2017
Excellent proposal, sorry to jump into this discussion so late. This
was on my to-read list for almost two weeks, and I finally got the time
to read the document. I have two minor comments:
I have the impression that the strict separation of Providers (the
data-processing systems) and Resources
BIG +1 JB,
If we can just bump the version number with minor changes, staying as
close as possible to the current implementation for Spark 1, we can go
faster and offer in principle the exact same support but for version
2.
I know that the advanced streaming stuff based on the Dataset API
won't be
Hi, Thanks for bringing this subject to the mailing list.
+1
We definitely need a consensus on this, and I agree with your proposal and
JB’s comments modulo certain clarifications:
I think we should follow this priority order if the version of the image we
want is available:
1. Image provided by
ose which validation to
>> unit-test and which to skip as trivial, so documentation on this topic
>> should be in the form of guidelines, high-quality example code (i.e. clean
>> up the unit tests of IOs bundled with Beam SDK), and informal knowledge in
>> the heads of readers of th
4 of which are binding:
> > * Aljoscha Krettek
> > * Davor Bonaci
> > * Ismaël Mejía
> > * Jean-Baptiste Onofré
> > * Robert Bradshaw
> > * Ted Yu
> > * Tibor Kiss
> >
> > There are no disapproving votes.
> >
> > Thanks everyone!
> >
> > Ahmet
> >
>
+0.5
I used to think that some of those tests were not worth it, for example
testBuildRead and testBuildReadAlt. However, the reality is that these tests
allowed me to find bugs both during the development of HBaseIO and just
yesterday when I tried to test the write support for the emulator with
+1 (non-binding)
- verified signatures + checksums
- ran mvn clean install -Prelease; all artifacts build and the tests run
smoothly (modulo some local issues I had with the installation of tox for
the Python SDK; I created a PR to fix those in case other people run into
the same trouble).
Some
I also found an issue with the .md5 and .sha1 files of the Python release:
they refer to a different default file (a forgotten part of the renaming):
curl
https://dist.apache.org/repos/dist/dev/beam/0.6.0/apache-beam-0.6.0-python.zip.md5
7d4170e381ce0e1aa8d11bee2e63d151 apache-beam-0.6.0.zip
This
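The mismatch reported above (a checksum file for apache-beam-0.6.0-python.zip that names apache-beam-0.6.0.zip) is the kind of thing a small script can catch during release validation. The following is a minimal sketch, assuming the checksum file holds a single `<digest> <filename>` line as in the curl output; the helper name `checksum_names_artifact` is hypothetical and not part of any Beam release tooling.

```python
def checksum_names_artifact(checksum_line: str, artifact_name: str) -> bool:
    """Return True if a '<digest> <filename>' line names the given artifact."""
    parts = checksum_line.split()
    if len(parts) != 2:
        # Unexpected layout (or a file name containing spaces): treat as a mismatch.
        return False
    _digest, named_file = parts
    # Some checksum tools prefix the file name with '*' in binary mode.
    return named_file.lstrip("*") == artifact_name


# Values taken from the message: the artifact name comes from the URL,
# the checksum line from the curl output.
line = "7d4170e381ce0e1aa8d11bee2e63d151 apache-beam-0.6.0.zip"
matches = checksum_names_artifact(line, "apache-beam-0.6.0-python.zip")
```

Here `matches` is `False`, which flags the stale file name. The check only compares names; it does not verify the digest itself.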
+1 to doing it periodically on different subjects.
It is a good idea to have a sort of mini agenda, in the sense that the two
previous meetings had very different focuses: the first one was about
contributors meeting each other and discussion of ongoing work just after
the project started on
Hello,
Thanks everyone for giving your points of view. I was waiting to see how
the conversation evolved to summarize it and continue on the open points.
Points where mostly everybody agrees (please correct me if somebody still
disagrees):
- Default metrics should not affect performance, for
This question got lost in the discussion, but there is a small improvement
that we can make:
> Just to check, are we doing parallel builds?
We are on Jenkins, not on Travis; there is an ongoing PR to fix this.
What we can improve is to check if we can run some of the test suites in
parallel to
Hello,
The new metrics API allows us to integrate some basic metrics into the Beam
IOs. I have been following some discussions about this on JIRAs/PRs, and I
think it is important to discuss the subject here so we can have more
awareness and obtain ideas from the community.
First I want to
Congratulations, well deserved, guys!
On Fri, Jan 27, 2017 at 9:28 AM, Amit Sela wrote:
> Welcome and congratulations to all!
>
> On Fri, Jan 27, 2017, 10:12 Ahmet Altay wrote:
>
> > Thank you all! And congratulations to other new committers.
>
Similar to yesterday's discussion about opening access to the Slack
channel, I wonder if it makes sense to let people assign themselves as
contributors and pick JIRAs without asking for this. Is this possible with
Apache's JIRA? And do you think this is a good idea?
On Tue, Jan 24, 2017 at 7:15
>>> pre-built packages for multi-node clusters of data stores. If there's a
>>> good repository of them that we trust, that would definitely save us
>>> time.
>>> Can you point me at the mesos repository?
>>>
>>> S
>>>
>>>
>>>
>>>