Re: FileIO.write() landed on master - give it a try

2017-12-19 Thread Jean-Baptiste Onofré

Sweet !!!

Thanks Eugene.

As part of my current work on ParquetIO, I will take a look !

Regards
JB

On 12/19/2017 11:41 PM, Eugene Kirpichov wrote:

Hey all,

A while ago I proposed an API http://s.apache.org/fileio-write .
It has just landed on master https://github.com/apache/beam/pull/3817, in 
somewhat improved form compared to the initial proposal.


I think it's a cool API and I'm excited that it'll be in Beam 2.3. Please give 
it a try (e.g. by using 2.3.0-SNAPSHOT) :)


Check out some examples in the Javadoc e.g. 
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java#L242 



The main selling points are:
- It is really, really easy to write files of a custom format using this API
(the example above shows how one could write lists to CSV with a header).
- The API is very Java8-friendly (much more so than the current 
DynamicDestinations APIs in TextIO/AvroIO, which I would like to deprecate in 
Beam 2.3)
- It gives a common API to use for various file-based IOs that want to get all 
the fancy features - e.g. https://github.com/apache/beam/pull/4294 shows how to 
do that with TFRecordIO and XmlIO: they previously didn't have access to 
features like dynamic destinations, and now they do; you can use
TFRecordIO.sink() and XmlIO.sink() with FileIO.write() or writeDynamic().


Thanks to +Reuven Lax and +Chamikara Jayalath for reviews.


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Callbacks/other functions run after a PDone/output transform

2017-12-19 Thread Eugene Kirpichov
I figured out the Never.ever() approach and it seems to work. Will finish
this up and send a PR at some point. Woohoo, thanks Kenn! Seems like this
will be quite a useful transform.
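
For reference, a rough sketch of how such a transform might be used once it lands. The names (Wait.on, its package, the sample pipeline) follow the proposal discussed below and are illustrative, not a released API:

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.transforms.Create;
  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.transforms.ParDo;
  import org.apache.beam.sdk.transforms.Wait;
  import org.apache.beam.sdk.values.PCollection;

  public class WaitOnSketch {
    public static void main(String[] args) {
      Pipeline p = Pipeline.create();

      // "signal" stands in for, e.g., the per-pane "done" output of a sink.
      PCollection<Long> signal = p.apply("Signal", Create.of(1L, 2L));
      PCollection<String> somePc = p.apply("Main", Create.of("a", "b", "c"));

      // Hold back elements of somePc (per window) until the final pane of
      // "signal" in the same window has fired, then run the follow-up ParDo.
      somePc
          .apply(Wait.<String>on(signal))
          .apply("AfterSignal", ParDo.of(new DoFn<String, String>() {
            @ProcessElement
            public void process(ProcessContext c) {
              c.output(c.element());
            }
          }));

      p.run();
    }
  }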

On Mon, Dec 18, 2017 at 1:23 PM Eugene Kirpichov wrote:

> I'm a bit confused by all of these suggestions: they sound plausible at a
> high level, but I'm having a hard time making any one of them concrete.
>
> So suppose we want to create a transform Wait.on(PCollection signal):
> PCollection -> PCollection.
> a.apply(Wait.on(sig)) returns a PCollection that is mostly identical to
> "a", but buffers panes of "a" in any given window until the final pane of
> "sig" in the same window is fired (or, if it's never fired, until the
> window closes? could use a deadletter for that maybe).
>
> This transform I suppose would need to have a keyed and unkeyed version.
>
> The keyed version would support merging window fns, and would require "a"
> and "sig" to be keyed by the same key, and would work using a CoGbk -
> followed by a stateful ParDo? Or is there a way to get away without a
> stateful ParDo here? (not all runners support it)
>
> The unkeyed version would not support merging window fns. Reuven, can you
> elaborate how your combiner idea would work here - in particular, what do
> you mean by "triggering only on the final pane"? Do you mean filter
> non-final panes before entering the combiner? I wonder if that'll work,
> probably worth a shot. And Kenn, can you elaborate on "re-trigger on the
> side input with a Never.ever() trigger"?
>
> Thanks.
>
> On Sun, Dec 17, 2017 at 1:28 PM Reuven Lax  wrote:
>
>> This is an interesting point.
>>
>> In the past, we've often just thought about sequencing some action to take
>> place after the sink, in which case you can simply use the sink output as a
>> main input. However if you want to run a transform with another PCollection
>> as a main input, this doesn't work. And as you've discovered, triggered
>> side inputs are defined to be non-deterministic, and there's no way to make
>> things line up.
>>
>> What you're describing only makes sense if you're blocking against the
>> final pane (since otherwise there's no reasonable way to match up somePC
>> panes with the sink panes). There are multiple ways you can do this: one
>> would be to CGBK the two PCollections together, and trigger the new
>> transform only on the final pane. Another would be to add a combiner that
>> returns a Void, triggering only on the final pane, and then make this
>> singleton Void a side input. You could also do something explicit with the
>> state API.
>>
>> Reuven
>>
>> On Fri, Dec 15, 2017 at 5:31 PM, Eugene Kirpichov wrote:
>>
>>> So this appears not as easy as anticipated (surprise!)
>>>
>>> Suppose we have a PCollection "donePanes" with an element per
>>> fully-processed pane: e.g. BigQuery sink, and elements saying "a pane of
>>> data has been written; this pane is: final / non-final".
>>>
>>> Suppose we want to use this to ensure that somePc.apply(ParDo.of(fn))
>>> happens only after the final pane has been written.
>>>
>>> In other words: we want a.apply(ParDo.of(b).withSideInput(c)) to happen
>>> when c emits a *final* pane.
>>>
>>> Unfortunately, using
>>> ParDo.of(fn).withSideInputs(donePanes.apply(View.asSingleton())) doesn't do
>>> the trick: the side input becomes ready the moment *the first* pane of
>>> data has been written.
>>>
>>> But neither does ParDo.of(fn).withSideInputs(donePanes.apply(...filter
>>> only final panes...).apply(View.asSingleton())). It also becomes ready the
>>> moment *the first* pane has been written; you just get an exception if
>>> you access the side input before the *final* pane was written.
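
To make that second attempt concrete, a rough sketch (the WriteDone element type and the helper are illustrative names for this discussion, not an existing API):

  import java.io.Serializable;
  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.transforms.ParDo;
  import org.apache.beam.sdk.transforms.View;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.beam.sdk.values.PCollectionView;

  public class FinalPaneSideInputSketch {
    // Stand-in for whatever element the sink emits once a pane has been written.
    public static class WriteDone implements Serializable {}

    // Drop non-final panes, then view the result as a singleton side input.
    // As described above, the resulting side input still becomes "ready" as
    // soon as the first pane of donePanes fires; the filter only changes what
    // you observe once you actually read the side input.
    public static PCollectionView<WriteDone> finalPaneView(PCollection<WriteDone> donePanes) {
      return donePanes
          .apply("KeepFinalPanes", ParDo.of(new DoFn<WriteDone, WriteDone>() {
            @ProcessElement
            public void process(ProcessContext c) {
              if (c.pane().isLast()) {
                c.output(c.element());
              }
            }
          }))
          .apply(View.<WriteDone>asSingleton());
    }
  }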
>>>
>>> I can't think of a pure-Beam solution to this: either "donePanes" will
>>> be used as a main input to something (and then everything else can only be
>>> a side input, which is not general enough), or it will be used as a side
>>> input (and then we can't achieve "trigger only after the final pane fires").
>>>
>>> It seems that we need a way to control the side input pushback, and
>>> configure whether a view becomes ready when its first pane has fired or
>>> when its last pane has fired. I could see this be a property on the View
>>> transform itself. In terms of implementation - I tried to figure out how
>>> side input readiness is determined, in the direct runner and Dataflow
>>> runner, and I'm completely lost and would appreciate some help.
>>>
>>> On Thu, Dec 7, 2017 at 12:01 AM Reuven Lax  wrote:
>>>
 This sounds great!

 On Mon, Dec 4, 2017 at 4:34 PM, Ben Chambers wrote:

> This would be absolutely great! It seems somewhat similar to the
> changes that were made to the BigQuery sink to support WriteResult (
> 

[DISCUSS] Guidelines for merging new runners and SDKs into master

2017-12-19 Thread Henning Rohde
Hi everyone,

 As part of the Go SDK development, I was looking at the guidelines for
merging new runners and SDKs into master [1] and I think they would benefit
from being updated to reflect the emerging portability framework. Specific
suggestions:

(1) Both runners and SDKs should support the portability framework (to the
extent the model is supported by the runner/SDK). It would be
counter-productive at this time for the ecosystem to go against that effort
without a compelling reason. Direct runners not included.

(2) What is the minimal set of IO connectors a new SDK must support? Given
the upcoming cross-language feature in the portability framework, can we
rely on that to meet the requirement
without implementing any native IO connectors?

(3) Similarly to new runners, new SDKs should handle at least a useful
subset of the model, but not necessarily the whole model (at the time of
merge). A global-window-batch-only SDK targeting the portability framework,
for example, could be as useful a contribution in master as a full model
SDK that is supported by a direct runner only. Of course, this is not to
say that SDKs should not strive to support the full model, but rather --
like Python streaming -- that it's fine to pursue that goal in master
beyond a certain point. That said, I'm curious why this guideline for SDKs
was originally set so specifically.

Finally, while portability support for various features -- such as side
input, cross-language I/O and the reference runner -- is still underway,
what should the guidelines be? For the Go SDK specifically, if in master,
it would bring the additional utility of helping test the portability
framework as it's being developed. On the other hand, it can't support
features that do not yet exist.

What do you all think?

Thanks,
 Henning

[1] https://beam.apache.org/contribute/feature-branches/


FileIO.write() landed on master - give it a try

2017-12-19 Thread Eugene Kirpichov
Hey all,

A while ago I proposed an API http://s.apache.org/fileio-write .
It has just landed on master https://github.com/apache/beam/pull/3817, in
somewhat improved form compared to the initial proposal.

I think it's a cool API and I'm excited that it'll be in Beam 2.3. Please
give it a try (e.g. by using 2.3.0-SNAPSHOT) :)

Check out some examples in the Javadoc e.g.
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java#L242


The main selling points are:
- It is really, really easy to write files of a custom format using this API
(the example above shows how one could write lists to CSV with a header; a
minimal sketch also follows below).
- The API is very Java8-friendly (much more so than the current
DynamicDestinations APIs in TextIO/AvroIO, which I would like to deprecate
in Beam 2.3)
- It gives a common API to use for various file-based IOs that want to get
all the fancy features - e.g. https://github.com/apache/beam/pull/4294 shows
how to do that with TFRecordIO and XmlIO: they previously didn't have
access to features like dynamic destinations, and now they do; you can use
TFRecordIO.sink() and XmlIO.sink() with FileIO.write() or writeDynamic().
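
For those who want to kick the tires quickly, here is a minimal sketch of the simplest usage against 2.3.0-SNAPSHOT: a single destination written through a plain text sink. It assumes TextIO.sink() as the FileIO.Sink; the class name, output directory and data are made up, and the Javadoc linked above covers custom sinks and writeDynamic():

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.io.FileIO;
  import org.apache.beam.sdk.io.TextIO;
  import org.apache.beam.sdk.transforms.Create;
  import org.apache.beam.sdk.values.PCollection;

  public class FileIOWriteSketch {
    public static void main(String[] args) {
      Pipeline p = Pipeline.create();

      PCollection<String> lines = p.apply(Create.of("id,score", "a,1", "b,2"));

      // Write the lines as text files under one destination directory using
      // the new FileIO.write() entry point together with a FileIO.Sink.
      lines.apply(FileIO.<String>write()
          .via(TextIO.sink())
          .to("/tmp/fileio-demo")
          .withSuffix(".csv"));

      p.run().waitUntilFinish();
    }
  }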

Thanks to +Reuven Lax and +Chamikara Jayalath for reviews.


Re: remove avro from core?

2017-12-19 Thread Jean-Baptiste Onofré

Hi Ismaël,

1. We can always use the license and version Maven plugins to check that. However,
it's the responsibility of the PMC to enforce security/CVE/legal points.


2. Agreed, CVEs (via secur...@apache.org) should be raised with Avro.

Take a look at https://www.apache.org/security/

Regards
JB

On 12/19/2017 11:28 AM, Ismaël Mejía wrote:

There are two important points raised here that we are missing in the
discussion:

1. We don't have any kind of security validation for Beam and its
dependencies. This is important, and it would be really nice if we could get
this improved, with some sort of automation around it. However, I doubt there
is a free service for this. Does the ASF have something for this? Or could
some of the companies involved in the project help us set this up?

Romain, can you share the security report with the Beam community, so we can
be aware of this and other security issues, at least for the moment?

2. The issues raised about Avro security should also be raised with the Avro
community. Beam is using the latest Avro release, so Beam cannot really be
blamed for dependency negligence.

Are the issues about Avro itself, or about the Jackson version? If it's the
latter, I don't think things will move quickly:
https://issues.apache.org/jira/browse/AVRO-1126


On Tue, Dec 19, 2017 at 11:23 AM, Jean-Baptiste Onofré wrote:

Agree, it's what I meant by "core transforms".

Regards
JB

On 12/19/2017 11:18 AM, Reuven Lax wrote:


Keep in mind that today Avro is one of the most common coders used for
user data types, not just for file IO. The reason for this is that it's the
easiest way to get a coder for a user's POJO - you simply annotate the POJO
with @DefaultCoder(AvroCoder.class), and it works. This is the coder used
for all internal shuffles (e.g. GroupByKey).

I would argue that most users don't really care about Avro for this use
case; what they really want is a way of saying "make this POJO work", and
Avro is the only way we give them. This was part of my argument in the
schema docs. However the status quo is that they use Avro here.

Reuven

On Tue, Dec 19, 2017 at 1:32 AM, Jean-Baptiste Onofré wrote:

 Hi Romain,

 It sounds good to me. I think any format should be packaged as an extension.

 The only point is that some core transforms expect a specific format, so it
 means that users will have to remember to add the Avro extension to use some
 transforms (or the transforms could be an extension as well). I have to check
 which transforms work like this.

 Regards
 JB

 On 12/19/2017 10:26 AM, Romain Manni-Bucau wrote:

 Hi guys,

 While checking security issues in a project I'm responsible for (which
 integrates Beam), I realized the Java SDK core module depends on Avro. From a
 security point of view it is a blocker because of the legacy dependencies Avro
 brings (Jackson from Codehaus, etc.), but all of that can be fixed. However, I
 would like to take this opportunity to open the topic of Avro in the core
 dependencies.

 From my point of view it doesn't make much sense, because it is just one of
 the serialization formats you can use with the file IO, and it is highly
 unlikely that all the potential formats will be imported into the core. Since
 it is a very local usage and not a core feature, I think it should be
 extracted - we can discuss extracting the actual transforms from the core in
 another thread; it would make a lot of sense IMHO, but that's not the current
 topic.

 Therefore I'd like to propose to extract the Avro format - like others - into
 an extension and remove it as a hard requirement of the core, to bring more
 consistency and modularity to Beam.

 Wdyt?

 Romain Manni-Bucau
 @rmannibucau | Blog | Old Blog | Github | LinkedIn


 -- Jean-Baptiste Onofré
 jbono...@apache.org 
 http://blog.nanthrax.net
 Talend - http://www.talend.com




--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: remove avro from core?

2017-12-19 Thread Ismaël Mejía
There are two important points raised here that we are missing in the
discussion:

1. We don't have any kind of security validation for Beam and its
dependencies. This is important, and it would be really nice if we could get
this improved, with some sort of automation around it. However, I doubt there
is a free service for this. Does the ASF have something for this? Or could
some of the companies involved in the project help us set this up?

Romain, can you share the security report with the Beam community, so we can
be aware of this and other security issues, at least for the moment?

2. The issues raised about Avro security should also be raised with the Avro
community. Beam is using the latest Avro release, so Beam cannot really be
blamed for dependency negligence.

Are the issues about Avro itself, or about the Jackson version? If it's the
latter, I don't think things will move quickly:
https://issues.apache.org/jira/browse/AVRO-1126


On Tue, Dec 19, 2017 at 11:23 AM, Jean-Baptiste Onofré wrote:
> Agree, it's what I meant by "core transforms".
>
> Regards
> JB
>
> On 12/19/2017 11:18 AM, Reuven Lax wrote:
>>
>> Keep in mind that today Avro is one of the most common coders used for
>> user data types, not just for file IO. The reason for this is that it's the
>> easiest way to get a coder for a user's POJO - you simply annotate the POJO
>> with @DefaultCoder(AvroCoder.class), and it works. This is the coder used
>> for all internal shuffles (e.g. GroupByKey).
>>
>> I would argue that most users don't really care about Avro for this use
>> case; what they really want is a way of saying "make this POJO work", and
>> Avro is the only way we give them. This was part of my argument in the
>> schema docs. However the status quo is that they use Avro here.
>>
>> Reuven
>>
>> On Tue, Dec 19, 2017 at 1:32 AM, Jean-Baptiste Onofré wrote:
>>
>> Hi Romain,
>>
>> It sounds good to me. I think any format should be packaged as an extension.
>>
>> The only point is that some core transforms expect a specific format, so it
>> means that users will have to remember to add the Avro extension to use some
>> transforms (or the transforms could be an extension as well). I have to check
>> which transforms work like this.
>>
>> Regards
>> JB
>>
>> On 12/19/2017 10:26 AM, Romain Manni-Bucau wrote:
>>
>> Hi guys,
>>
>> While checking security issues in a project I'm responsible for (which
>> integrates Beam), I realized the Java SDK core module depends on Avro. From a
>> security point of view it is a blocker because of the legacy dependencies Avro
>> brings (Jackson from Codehaus, etc.), but all of that can be fixed. However, I
>> would like to take this opportunity to open the topic of Avro in the core
>> dependencies.
>>
>> From my point of view it doesn't make much sense, because it is just one of
>> the serialization formats you can use with the file IO, and it is highly
>> unlikely that all the potential formats will be imported into the core. Since
>> it is a very local usage and not a core feature, I think it should be
>> extracted - we can discuss extracting the actual transforms from the core in
>> another thread; it would make a lot of sense IMHO, but that's not the current
>> topic.
>>
>> Therefore I'd like to propose to extract the Avro format - like others - into
>> an extension and remove it as a hard requirement of the core, to bring more
>> consistency and modularity to Beam.
>>
>> Wdyt?
>>
>> Romain Manni-Bucau
>> @rmannibucau | Blog | Old Blog | Github | LinkedIn
>>
>>
>> -- Jean-Baptiste Onofré
>> jbono...@apache.org 
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>>
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com


Re: remove avro from core?

2017-12-19 Thread Jean-Baptiste Onofré

Agree, it's what I meant by "core transforms".

Regards
JB

On 12/19/2017 11:18 AM, Reuven Lax wrote:
Keep in mind that today Avro is one of the most common coders used for user data 
types, not just for file IO. The reason for this is that it's the easiest way to 
get a coder for a user's POJO - you simply annotate the POJO with
@DefaultCoder(AvroCoder.class), and it works. This is the coder used for all 
internal shuffles (e.g. GroupByKey).
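
As a concrete illustration of the pattern Reuven describes, a minimal sketch of such a POJO (the class and fields are made up; @DefaultCoder and AvroCoder are the existing SDK annotation and coder):

  import org.apache.beam.sdk.coders.AvroCoder;
  import org.apache.beam.sdk.coders.DefaultCoder;

  // A user POJO made encodable by annotating it: AvroCoder then handles
  // serialization for internal shuffles (e.g. GroupByKey) with no
  // hand-written coder.
  @DefaultCoder(AvroCoder.class)
  public class MyRecord {
    private String id;
    private long count;

    // AvroCoder (reflect-based) needs a no-arg constructor to decode instances.
    public MyRecord() {}

    public MyRecord(String id, long count) {
      this.id = id;
      this.count = count;
    }
  }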


I would argue that most users don't really care about Avro for this use case;
what they really want is a way of saying "make this POJO work" and Avro is the 
only way we give them. This was part of my argument in the schema docs. However 
the status quo is that they use Avro here.


Reuven

On Tue, Dec 19, 2017 at 1:32 AM, Jean-Baptiste Onofré wrote:


Hi Romain,

It sounds good to me. I think any format should be packaged as an extension.

The only point is that some core transforms expect a specific format, so it
means that users will have to remember to add the Avro extension to use some
transforms (or the transforms could be an extension as well). I have to check
which transforms work like this.

Regards
JB

On 12/19/2017 10:26 AM, Romain Manni-Bucau wrote:

Hi guys,

While checking security issues in a project I'm responsible for (which
integrates Beam), I realized the Java SDK core module depends on Avro. From a
security point of view it is a blocker because of the legacy dependencies Avro
brings (Jackson from Codehaus, etc.), but all of that can be fixed. However, I
would like to take this opportunity to open the topic of Avro in the core
dependencies.

From my point of view it doesn't make much sense, because it is just one of
the serialization formats you can use with the file IO, and it is highly
unlikely that all the potential formats will be imported into the core. Since
it is a very local usage and not a core feature, I think it should be
extracted - we can discuss extracting the actual transforms from the core in
another thread; it would make a lot of sense IMHO, but that's not the current
topic.

Therefore I'd like to propose to extract the Avro format - like others - into
an extension and remove it as a hard requirement of the core, to bring more
consistency and modularity to Beam.

Wdyt?

Romain Manni-Bucau
@rmannibucau | Blog | Old Blog | Github | LinkedIn


-- 
Jean-Baptiste Onofré

jbono...@apache.org 
http://blog.nanthrax.net
Talend - http://www.talend.com




--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


remove avro from core?

2017-12-19 Thread Romain Manni-Bucau
Hi guys,

While checking security issues in a project I'm responsible for (which
integrates Beam), I realized the Java SDK core module depends on Avro. From a
security point of view it is a blocker because of the legacy dependencies Avro
brings (Jackson from Codehaus, etc.), but all of that can be fixed. However, I
would like to take this opportunity to open the topic of Avro in the core
dependencies.

From my point of view it doesn't make much sense, because it is just one of
the serialization formats you can use with the file IO, and it is highly
unlikely that all the potential formats will be imported into the core. Since
it is a very local usage and not a core feature, I think it should be
extracted - we can discuss extracting the actual transforms from the core in
another thread; it would make a lot of sense IMHO, but that's not the current
topic.

Therefore I'd like to propose to extract the Avro format - like others - into
an extension and remove it as a hard requirement of the core, to bring more
consistency and modularity to Beam.

Wdyt?

Romain Manni-Bucau
@rmannibucau | Blog | Old Blog | Github | LinkedIn


Build failed in Jenkins: beam_Release_NightlySnapshot #626

2017-12-19 Thread Apache Jenkins Server
See 


Changes:

[robertwb] [BEAM-3183] Add runner.run(transform) to Python SDK.

[jbonofre] [BEAM-1920] Upgrade to Spark runner to Spark 2.2.1

[jbonofre] [BEAM-3340] Update Flink Runner to Flink 1.4.0

[robertwb] [BEAM-3143] Type Inference Python 3 Compatibility (#4183)

[robertwb] Fix getitem and list comprehension type inference.

[lcwik] [BEAM-2929] Remove Dataflow expansions for PCollectionView that have

[github] Guard against closing data channel twice (#4283)

--
[...truncated 3.10 MB...]
2017-12-19T08:22:09.471 [INFO] Excluding org.json4s:json4s-jackson_2.11:jar:3.2.11 from the shaded jar.
2017-12-19T08:22:09.471 [INFO] Excluding org.json4s:json4s-core_2.11:jar:3.2.11 from the shaded jar.
2017-12-19T08:22:09.471 [INFO] Excluding org.json4s:json4s-ast_2.11:jar:3.2.11 from the shaded jar.
2017-12-19T08:22:09.471 [INFO] Excluding org.scala-lang:scalap:jar:2.11.0 from the shaded jar.
2017-12-19T08:22:09.471 [INFO] Excluding org.scala-lang:scala-compiler:jar:2.11.0 from the shaded jar.
2017-12-19T08:22:09.471 [INFO] Excluding org.scala-lang.modules:scala-xml_2.11:jar:1.0.1 from the shaded jar.
2017-12-19T08:22:09.471 [INFO] Excluding org.glassfish.jersey.core:jersey-client:jar:2.22.2 from the shaded jar.
2017-12-19T08:22:09.471 [INFO] Excluding javax.ws.rs:javax.ws.rs-api:jar:2.0.1 from the shaded jar.
2017-12-19T08:22:09.471 [INFO] Excluding org.glassfish.hk2:hk2-api:jar:2.4.0-b34 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.hk2:hk2-utils:jar:2.4.0-b34 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.hk2.external:aopalliance-repackaged:jar:2.4.0-b34 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.hk2.external:javax.inject:jar:2.4.0-b34 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.hk2:hk2-locator:jar:2.4.0-b34 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.jersey.core:jersey-common:jar:2.22.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding javax.annotation:javax.annotation-api:jar:1.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.jersey.bundles.repackaged:jersey-guava:jar:2.22.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.hk2:osgi-resource-locator:jar:1.0.1 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.jersey.core:jersey-server:jar:2.22.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.jersey.media:jersey-media-jaxb:jar:2.22.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.jersey.containers:jersey-container-servlet:jar:2.22.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.glassfish.jersey.containers:jersey-container-servlet-core:jar:2.22.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding io.netty:netty-all:jar:4.0.43.Final from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding io.netty:netty:jar:3.9.9.Final from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding io.dropwizard.metrics:metrics-jvm:jar:3.1.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding io.dropwizard.metrics:metrics-json:jar:3.1.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding io.dropwizard.metrics:metrics-graphite:jar:3.1.2 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.apache.ivy:ivy:jar:2.4.0 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding oro:oro:jar:2.0.8 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding net.razorvine:pyrolite:jar:4.13 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding net.sf.py4j:py4j:jar:0.10.4 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.apache.spark:spark-tags_2.11:jar:2.2.1 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.apache.commons:commons-crypto:jar:1.0.0 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.spark-project.spark:unused:jar:1.0.0 from the shaded jar.
2017-12-19T08:22:09.472 [INFO] Excluding org.apache.spark:spark-streaming_2.11:jar:2.2.1 from the shaded jar.
2017-12-19T08:22:11.300 [INFO] Replacing original artifact with shaded artifact.
2017-12-19T08:22:11.406 [INFO] 
2017-12-19T08:22:11.406 [INFO] --- maven-javadoc-plugin:3.0.0-M1:jar (attach-javadocs) @ beam-sdks-java-javadoc ---
2017-12-19T08:22:11.410 [INFO] Not executing Javadoc as the project is not a Java classpath-capable package
2017-12-19T08:22:11.518 [INFO] 
2017-12-19T08:22:11.518 [INFO] --- maven-source-plugin:3.0.1:jar-no-fork (attach-sources) @ beam-sdks-java-javadoc ---
2017-12-19T08:22:11.625 [INFO] 
2017-12-19T08:22:11.625 [INFO] --- maven-source-plugin:3.0.1:test-jar-no-fork (attach-test-sources) @ beam-sdks-java-javadoc ---
2017-12-19T08:22:11.731 [INFO]