mvnrepository.com doesn't matter (it will sync later); the actual Central URL is:
http://repo.maven.apache.org/maven2/org/apache/beam/beam-sdks-java-core/
And 2.2.0 is there.
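For a quick check against Central itself, here's a sketch (appending
maven-metadata.xml assumes the standard repository layout; it's not quoted
anywhere in this thread):

  curl -s http://repo.maven.apache.org/maven2/org/apache/beam/beam-sdks-java-core/maven-metadata.xml | grep 2.2.0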
Regards
JB
On 11/25/2017 08:01 PM, Reuven Lax wrote:
BTW,
It's been over a day, and I still don't see 2.2.0 listed at
https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-core
How long does it usually take to promote the artifacts here?
On Fri, Nov 24, 2017 at 3:43 PM, Reuven Lax <[email protected]> wrote:
Appears to be a problem :)
I tried publishing the latest artifact from Apache Nexus to Maven Central.
After clicking publish, Nexus claimed that the operation had completed.
However, a look at the Maven Central page
(https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-core) does not show 2.2.0
Does anyone know what happened here?
Reuven
On Wed, Nov 22, 2017 at 11:04 PM, Thomas Weise <[email protected]> wrote:
+1
Ran the quickstart with the Apex runner in embedded mode and on YARN.
It needed a couple of tweaks to get there, though.
1) Change the quickstart pom.xml apex-runner profile:
    <!--
      Apex 3.6 is built against YARN 2.6. For this fat jar, the
      included version has to match what's on the cluster,
      hence we need to repeat the Apex Hadoop dependencies at the
      required version here.
    -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-yarn-client</artifactId>
      <version>${hadoop.version}</version>
      <scope>runtime</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
      <scope>runtime</scope>
    </dependency>
2) After copying the fat jar to the cluster:
java -cp word-count-beam-bundled-0.1.jar org.apache.beam.examples.WordCount \
  --inputFile=file:///tmp/input.txt --output=/tmp/counts \
  --embeddedExecution=false --configFile=beam-runners-apex.properties \
  --runner=ApexRunner
(this was on a single node cluster, hence the local file path)
The quickstart instructions suggest using *mvn exec:java* instead of *java*
- it generally isn't valid to assume that mvn and a build environment
exist on the edge node of a YARN cluster.
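For reference, a rough sketch of producing the bundled fat jar above from the
quickstart project (the hadoop.version value is an assumption here; per the pom
comment above it has to match the Hadoop/YARN version on the cluster):

  mvn clean package -Papex-runner -Dhadoop.version=2.6.0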
On Wed, Nov 22, 2017 at 2:12 PM, Nishu <[email protected]> wrote:
Hi Eugene,
I ran it on both standalone Flink (non-YARN) and Flink on an HDInsight
cluster (YARN). Both ran successfully. :)
Regards,
Nishu
On Wed, Nov 22, 2017 at 9:40 PM, Eugene Kirpichov <[email protected]> wrote:
Thanks Nishu. So, if I understand correctly, your pipelines were running
on non-YARN, but you're planning to run with YARN?
I meanwhile was able to get Flink running on Dataproc (YARN), and
validated quickstart and game examples.
At this point we need validation for Spark and Flink non-YARN [I think if
Nishu's runs were non-YARN, they'd give us enough confidence, combined with
the success of other validations of Spark and Flink runners?], and Apex on
YARN. However, it seems that in previous RCs we were not validating Apex on
YARN, only local cluster. Is it needed this time?
On Wed, Nov 22, 2017 at 12:28 PM Nishu <[email protected]> wrote:
Hi Eugene,
No, I didn't try those; instead I have my custom pipeline where a Kafka
topic is the source. I have defined a global window and a processing-time
trigger to read the data. Further, it runs some transformations, i.e.
GroupByKey and CoGroupByKey, on the windowed collections.
I was running the same pipeline on the direct runner and the Spark runner
earlier. Today I gave it a try with Flink on YARN.
Best Regards,
Nishu.
On Wed, Nov 22, 2017 at 8:07 PM, Eugene Kirpichov <[email protected]> wrote:
Thanks Nishu! Can you clarify which pipeline you were running?
The validation spreadsheet includes 1) the quickstart and 2) mobile game
walkthroughs. Was it one of these, or your custom pipeline?
On Wed, Nov 22, 2017 at 10:20 AM Nishu <[email protected]> wrote:
Hi,
Typo in previous mail. I meant Flink runner.
Thanks,
Nishu
On Wed, 22 Nov 2017 at 19.17,
Hi,
I built a pipeline using RC 2.2 today and ran it with the runner on YARN.
It worked seamlessly for unbounded sources. Couldn't see any issues with
my pipeline so far :)
Thanks,
Nishu
On Wed, 22 Nov 2017 at 18.57, Reuven Lax <[email protected]> wrote:
Who is validating Flink and YARN?
On Tue, Nov 21, 2017 at 9:26 AM, Kenneth Knowles <[email protected]> wrote:
On Mon, Nov 20, 2017 at 5:01 PM, Eugene Kirpichov <[email protected]> wrote:
In the verification spreadsheet, I'm not sure I understand the difference
between the "YARN" and "Standalone cluster/service". Which is Dataproc? It
definitely uses YARN, but it is also a standalone cluster/service. Does it
count for both?
No, it doesn't. A number of runners have their own non-YARN cluster mode. I
would expect that the launching experience might be different and the
portable container management to differ. If they are identical, experts in
those systems should feel free to coalesce the rows. Conversely, as other
platforms become supported, they could be added or not based on whether
they are substantively different from a user experience or QA point of view.
Kenn
Seems now we're missing just Apex and Flink cluster verifications.
*though Spark runner took 6x longer to run UserScore, partially I guess
because it didn't do autoscaling (Dataflow runner ramped up to 5 workers
whereas Spark runner used 2 workers). For some reason Spark runner chose
not to split the 10GB input files into chunks.
On Mon, Nov 20, 2017 at 3:46 PM Reuven Lax <[email protected]> wrote:
Done
On Tue, Nov 21, 2017 at 3:08 AM, Robert Bradshaw <[email protected]> wrote:
Thanks. You need to re-sign as well.
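Roughly, re-signing and re-hashing the source archive from [2] would look like
this (the file name is illustrative, not taken from this thread):

  gpg --armor --detach-sign apache-beam-2.2.0-source-release.zip
  sha1sum apache-beam-2.2.0-source-release.zip > apache-beam-2.2.0-source-release.zip.sha1
  md5sum apache-beam-2.2.0-source-release.zip > apache-beam-2.2.0-source-release.zip.md5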
On Mon, Nov 20, 2017 at 12:14 AM, Reuven Lax <[email protected]> wrote:
FYI these generated files have been removed from the source distribution.
On Sat, Nov 18, 2017 at 9:09 AM, Reuven Lax <[email protected]> wrote:
hmmm, I thought I removed those generated files from the zip file before
sending this email. Let me check again.
Reuven
On Sat, Nov 18, 2017 at 8:52 AM, Robert Bradshaw <[email protected]> wrote:
The source distribution contains a couple of files not on github (e.g.
folders that were added on master, Python generated files). The pom
files differed only by missing -SNAPSHOT, other than that presumably
the source release should just be
"wget https://github.com/apache/beam/archive/release-2.2.0.zip"?
diff -rq apache-beam-2.2.0 beam/ | grep -v pom.xml
# OK?
Only in apache-beam-2.2.0: DEPENDENCIES
# Expected.
Only in beam/: .git
Only in beam/: .gitattributes
Only in beam/: .gitignore
# These folders are probably from switching around between master and git branches.
Only in apache-beam-2.2.0: model
Only in apache-beam-2.2.0/runners/flink: examples
Only in apache-beam-2.2.0/runners/flink: runner
Only in apache-beam-2.2.0/runners/gearpump: jarstore
Only in apache-beam-2.2.0/sdks/java/extensions: gcp-core
Only in apache-beam-2.2.0/sdks/java/extensions: sketching
Only in apache-beam-2.2.0/sdks/java/io: file-based-io-tests
Only in apache-beam-2.2.0/sdks/java/io: hdfs
Only in apache-beam-2.2.0/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources: src
Only in apache-beam-2.2.0/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources: src
Only in apache-beam-2.2.0/sdks/java: microbenchmarks
# Here's the generated protos.
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_artifact_api_pb2_grpc.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_artifact_api_pb2.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_fn_api_pb2_grpc.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_fn_api_pb2.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_job_api_pb2_grpc.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_job_api_pb2.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_provision_api_pb2_grpc.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_provision_api_pb2.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_runner_api_pb2_grpc.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_runner_api_pb2.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: endpoints_pb2_grpc.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: endpoints_pb2.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: standard_window_fns_pb2_grpc.py
Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: standard_window_fns_pb2.py
And some other sdist generated Python files.
Only in apache-beam-2.2.0/sdks/python: .eggs
Only in apache-beam-2.2.0/sdks/python: LICENSE
Only in apache-beam-2.2.0/sdks/python: NOTICE
Only in apache-beam-2.2.0/sdks/python: README.md
Presumably we should just purge these files from the rc?
FWIW, the Python tarball looks fine.
On Fri, Nov 17, 2017 at 4:40 PM, Eugene Kirpichov <[email protected]> wrote:
How can I specify a dependency on the staged RC? E.g. I'm trying to
validate the quickstart per https://beam.apache.org/get-started/quickstart-java/
and specifying version 2.2.0 doesn't work, I suppose because it's not
released yet. Should I pass some command-line flag to mvn to make it fetch
the version from the staging area?
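One approach, sketched here rather than quoted from the thread: add the staging
repository listed under [4] below to the generated quickstart pom (the repository
id is arbitrary), then build as usual:

  <repositories>
    <repository>
      <id>beam-staging-1025</id>
      <url>https://repository.apache.org/content/repositories/orgapachebeam-1025/</url>
    </repository>
  </repositories>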
On Fri, Nov 17, 2017 at 4:37 PM Lukasz Cwik <[email protected]> wrote:
It's open to all, it's just that there are binding votes and non-binding
votes.
On Fri, Nov 17, 2017 at 4:26 PM, Valentyn Tymofieiev <[email protected]> wrote:
I have a process question: is the vote open for committers only or for all
contributors?
On Fri, Nov 17, 2017 at 4:06 PM, Lukasz Cwik <[email protected]> wrote:
+1, Approve the release
I have verified the wordcount quickstart on the Apache Beam website using
Apex, DirectRunner, Flink & Spark on Linux.
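For reference, the quickstart project itself is generated roughly like this
(this follows the quickstart instructions rather than anything quoted in this
thread; the groupId and artifactId are just the suggested defaults):

  mvn archetype:generate \
    -DarchetypeGroupId=org.apache.beam \
    -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
    -DarchetypeVersion=2.2.0 \
    -DgroupId=org.example \
    -DartifactId=word-count-beam \
    -Dversion="0.1" \
    -Dpackage=org.apache.beam.examples \
    -DinteractiveMode=false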
The Gearpump runner is yet to have a quickstart listed on our website.
Adding the quickstart is already represented by this existing issue:
https://issues.apache.org/jira/browse/BEAM-2692
On Fri, Nov 17, 2017 at 11:50 AM, Valentyn Tymofieiev <[email protected]> wrote:
I have verified SHA & MD5 signatures of the Python artifacts in [2], and
checked the Python side of the validation checklist on Linux.
There is one known issue in the UserScore example for the Dataflow runner.
The issue has been fixed on the master branch and does not require a
cherry-pick at this point. A workaround is to pass the --save_main_session
pipeline option.
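Roughly, the workaround looks like the following; the module path and the other
flags are illustrative placeholders, not something taken from this thread:

  python -m apache_beam.examples.complete.game.user_score \
    --runner=DataflowRunner \
    --project=<your-gcp-project> \
    --temp_location=gs://<your-bucket>/tmp \
    --output=gs://<your-bucket>/user_score/output \
    --save_main_session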
RC4 looks good to me so far.
On Fri, Nov 17, 2017 at 11:30 AM, Kenneth Knowles <[email protected]> wrote:
Hi all,
Following up on past discussions and
https://issues.apache.org/jira/browse/BEAM-1189 I have prepared a
spreadsheet so we can sign up for validation steps that must be done by a
human.
The spreadsheet for 2.2.0 is at
https://s.apache.org/beam-2.2.0-release-validation. Everyone can edit, so
it is easy to sign up or add new rows.
FYI the template is at https://s.apache.org/beam-release-validation. After
this release, I will update the rows in the template to match whatever we
have added to the validation for 2.2.0.
Kenn
On Thu, Nov 16, 2017 at 10:08 PM, Reuven Lax <[email protected]> wrote:
Hi everyone,
Please review and vote on the release candidate #4 for the version 2.2.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)
The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint B98B7708 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.2.0-RC4" [5],
* website pull request listing the release and publishing the API
reference manual [6].
* Java artifacts were built with Maven 3.5.0 and OpenJDK/Oracle JDK
1.8.0_144.
* Python artifacts are deployed along with the source release to the
dist.apache.org [2].
The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.
Thanks,
Reuven
[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12341044
[2] https://dist.apache.org/repos/dist/dev/beam/2.2.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1025/
[5] https://github.com/apache/beam/tree/v2.2.0-RC4
[6] https://github.com/apache/beam-site/pull/337
--
Thanks & Regards,
Nishu Tayal
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com