[jira] [Commented] (BEAM-8388) Update Avro to 1.9.1 from 1.8.2

2020-01-15 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016510#comment-17016510
 ] 

Aaron Dixon commented on BEAM-8388:
---

[~suztomo] 2.17.0 references a class `TimestampConversion` that is only in Avro 
1.8.x. It appears that the PR you referenced (that didn't get merged) shows the 
required change away from using `TimestampConversion`: 
[https://github.com/apache/beam/pull/9779/files]

Regarding Ken's comment – wouldn't shading Avro be an apropos solution? This is 
a very common approach to situations like this. (See 
[https://maven.apache.org/plugins/maven-shade-plugin/].) That way Beam could 
(internally, for AvroCoder, etc.) use whichever Avro version it prefers, while 
clients could depend directly on any Avro version without conflicting with 
Beam's.

 

 

> Update Avro to 1.9.1 from 1.8.2
> ---
>
> Key: BEAM-8388
> URL: https://issues.apache.org/jira/browse/BEAM-8388
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-avro
>Reporter: Jordanna Chord
>Assignee: Jordanna Chord
>Priority: Major
>   Original Estimate: 24h
>  Time Spent: 3h 20m
>  Remaining Estimate: 20h 40m
>
> Update build dependency to 1.9.1



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8388) Update Avro to 1.9.1 from 1.8.2

2020-01-15 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016525#comment-17016525
 ] 

Aaron Dixon commented on BEAM-8388:
---

More context, just in case it is helpful: I didn't discover this first through 
the PR. I upgraded to Beam 2.17.0, ran my job (in Dataflow), and got a slew 
of ClassNotFoundExceptions for TimestampConversion coming from AvroCoder.
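
For anyone hitting the same error, one way to check which of the two class names is actually loadable on a given classpath is a small reflection probe. This is only a sketch: `ClasspathProbe` and `firstAvailable` are hypothetical names, and the two candidate names are the Avro 1.8 and Avro 1.9 nested-class names mentioned in these tickets.

```java
import java.util.Optional;

/** Probes the classpath for the first loadable class among several candidate names. */
final class ClasspathProbe {

    private ClasspathProbe() {}

    /**
     * Returns the first candidate name that resolves to a class, or empty if none does.
     * Classes are looked up without being initialized, so no static initializers run.
     */
    static Optional<String> firstAvailable(String... candidateNames) {
        ClassLoader loader = ClasspathProbe.class.getClassLoader();
        for (String name : candidateNames) {
            try {
                Class.forName(name, false, loader);
                return Optional.of(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not present (or not linkable) on this classpath; try the next name.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> found = firstAvailable(
            "org.apache.avro.data.TimeConversions$TimestampConversion",      // Avro 1.8.x name
            "org.apache.avro.data.JodaTimeConversions$TimestampConversion"); // Avro 1.9.x name
        System.out.println(found.orElse("no Joda-based TimestampConversion on the classpath"));
    }
}
```

Running the probe with the same dependency set as the failing job should show which Avro line the build actually resolved; it does not change anything, it only reports.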



[jira] [Commented] (BEAM-8388) Update Avro to 1.9.1 from 1.8.2

2020-01-15 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016476#comment-17016476
 ] 

Aaron Dixon commented on BEAM-8388:
---

I'd like to add a vote to this ticket with some context:

My org must use Avro 1.9.x (due to some Avro schema resolution issues resolved 
in 1.9.x) so downgrading Avro is not possible for us.

Beam 2.16.0 is compatible with our usage of Avro 1.9.x – but upgrading to 
2.17.0 we are broken as 2.17.0 links to Java classes in Avro 1.8.x that are not 
available in 1.9.x.

So we are stuck on 2.16.0 until Beam can eliminate the hard dependency on 
1.8.x. I see that there is [this now-closed 
PR|https://github.com/apache/beam/pull/9779] that hoped to address this.

Would shading Avro (maven-shade-plugin) be an option for Beam here, so that 
clients could use their own version of Avro? (For context, we use Avro 
to parse events from a KafkaIO source; it would be perfectly fine if Beam 
internally used a shaded 1.8.x version to do its own coding/serde.)



[jira] [Commented] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-17 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018400#comment-17018400
 ] 

Aaron Dixon commented on BEAM-9144:
---

[~iemejia] Thanks, I'd like to help test. I'm relatively new to Beam and very 
new to its dev/build process. Will it be sufficient for me to pull 
2.20.0-SNAPSHOT from the Apache snapshots repo 
([https://repository.apache.org/content/repositories/snapshots/])? And I assume 
I'll need to wait until tomorrow so that this work is incorporated in a nightly?

If so, this weekend I can run my Dataflow pipeline against 2.20.0-SNAPSHOT from 
the Apache snapshots repo and report back on whether it is happy. FYI/fwiw, my 
specific Beam dependencies are these, so I'll be testing against their 
2.20.0-SNAPSHOT versions:
{code:java}
[org.apache.beam/beam-sdks-java-core]
[org.apache.beam/beam-sdks-java-io-kafka]
[org.apache.beam/beam-runners-google-cloud-dataflow-java]{code}

> Beam's own Avro TimeConversion class in beam-sdk-java-core 
> ---
>
> Key: BEAM-9144
> URL: https://issues.apache.org/jira/browse/BEAM-9144
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Fix For: 2.19.0
>
> Attachments: avro-beam-dependency-graph.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> From Aaron's comment in 
> https://issues.apache.org/jira/browse/BEAM-8388?focusedCommentId=17016476=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17016476
>  .
> {quote}My org must use Avro 1.9.x (due to some Avro schema resolution issues 
> resolved in 1.9.x) so downgrading Avro is not possible for us.
>  Beam 2.16.0 is compatible with our usage of Avro 1.9.x – but upgrading to 
> 2.17.0 we are broken as 2.17.0 links to Java classes in Avro 1.8.x that are 
> not available in 1.9.x.
> {quote}
> The Java class is 
> {{org.apache.avro.data.TimeConversions.TimestampConversion}} in Avro 1.8.
>  It's renamed to {{org.apache.avro.data.JodaTimeConversions}} in Avro 1.9.
> h1. Beam Java SDK cannot upgrade Avro to 1.9
> Beam has Spark runners and Spark has not yet upgraded to Avro 1.9.
> Illustration of the dependency
> !avro-beam-dependency-graph.png|width=799,height=385!
> h1. Short-term Solution
> As illustrated above, as long as Beam Java SDK uses only the intersection of 
> Avro classes, methods, and fields between Avro 1.8 and 1.9, it will provide 
> flexibility in runtime Avro versions (as it did until Beam 2.16).
> h2. Difference of the TimeConversion Classes
> Avro 1.9's TimestampConversion additionally overrides the 
> {{getRecommendedSchema}} method. Details below:
> Avro 1.8's TimeConversions.TimestampConversion:
> {code:java}
>   public static class TimestampConversion extends Conversion<DateTime> {
>     @Override
>     public Class<DateTime> getConvertedType() {
>       return DateTime.class;
>     }
>     @Override
>     public String getLogicalTypeName() {
>       return "timestamp-millis";
>     }
>     @Override
>     public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
>       return new DateTime(millisFromEpoch, DateTimeZone.UTC);
>     }
>     @Override
>     public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
>       return timestamp.getMillis();
>     }
>   }
> {code}
> Avro 1.9's JodaTimeConversions.TimestampConversion:
> {code:java}
>   public static class TimestampConversion extends Conversion<DateTime> {
>     @Override
>     public Class<DateTime> getConvertedType() {
>       return DateTime.class;
>     }
>     @Override
>     public String getLogicalTypeName() {
>       return "timestamp-millis";
>     }
>     @Override
>     public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
>       return new DateTime(millisFromEpoch, DateTimeZone.UTC);
>     }
>     @Override
>     public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
>       return timestamp.getMillis();
>     }
>     @Override
>     public Schema getRecommendedSchema() {
>       return LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
>     }
>   }
> {code}
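
The conversion logic in the two quoted classes is only a few lines, which is what makes a project-owned copy a plausible short-term fix. As a rough, dependency-free illustration of that idea (not Beam's actual class): `OwnTimestampConversion` is a hypothetical name, `java.time.Instant` stands in for Joda's `DateTime`, and the Avro `Conversion` base class and the `Schema`/`LogicalType` parameters are left out so that the snippet compiles on its own.

```java
import java.time.Instant;

/**
 * Dependency-free stand-in for a project-owned "timestamp-millis" conversion.
 * Assumption: java.time.Instant replaces Joda DateTime, and the Avro Conversion
 * base class plus the Schema/LogicalType parameters are omitted.
 */
final class OwnTimestampConversion {

    /** Logical type name, as in the quoted Avro classes. */
    String getLogicalTypeName() {
        return "timestamp-millis";
    }

    /** Decodes an Avro long (milliseconds since the epoch, UTC) into an instant. */
    Instant fromLong(Long millisFromEpoch) {
        return Instant.ofEpochMilli(millisFromEpoch);
    }

    /** Encodes an instant back into milliseconds since the epoch. */
    Long toLong(Instant timestamp) {
        return timestamp.toEpochMilli();
    }

    public static void main(String[] args) {
        OwnTimestampConversion conversion = new OwnTimestampConversion();
        Instant decoded = conversion.fromLong(1579046400000L);
        System.out.println(decoded + " -> " + conversion.toLong(decoded));
    }
}
```

Because such a class would live in the project's own package, it would not matter at link time whether the runtime Avro jar ships `TimeConversions.TimestampConversion` or `JodaTimeConversions.TimestampConversion`.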





[jira] [Commented] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-17 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018480#comment-17018480
 ] 

Aaron Dixon commented on BEAM-9144:
---

Perfect. Thanks [~suztomo], will report back



[jira] [Commented] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-22 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021450#comment-17021450
 ] 

Aaron Dixon commented on BEAM-9144:
---

[~suztomo] I've verified that my Dataflow/KafkaIO pipeline (using Avro 1.9.x) 
can now make progress. I used the latest nightly 2.20.0-SNAPSHOT and the 
pre-built worker jar you published (thanks for that!).

> Beam's own Avro TimeConversion class in beam-sdk-java-core 
> ---
>
> Key: BEAM-9144
> URL: https://issues.apache.org/jira/browse/BEAM-9144
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Fix For: 2.19.0
>
> Attachments: NoClassDefFoundError in word-count-beam.png, 
> avro-beam-dependency-graph.png, dataflow-not-finish.png, 
> dataflowWorkerJar_succeeded.png, dataflow_step_job_id_OBFUSC-0.json
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h


[jira] [Commented] (BEAM-8388) Update Avro to 1.9.1 from 1.8.2

2020-01-16 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017519#comment-17017519
 ] 

Aaron Dixon commented on BEAM-8388:
---

Just to add more context/help here. The exception I see after upgrading from 
Beam 2.16->2.17 is this (running in Dataflow):

```
java.lang.NoClassDefFoundError: org/apache/avro/data/TimeConversions$TimestampConversion
    at org.apache.beam.sdk.coders.AvroCoder.<init> (AvroCoder.java:269)
    at org.apache.beam.sdk.coders.AvroCoder.of (AvroCoder.java:121)
    at org.apache.beam.sdk.io.kafka.KafkaUnboundedSource.getCheckpointMarkCoder (KafkaUnboundedSource.java:131)
    at org.apache.beam.runners.dataflow.worker.WorkerCustomSources$UnboundedReader.iterator (WorkerCustomSources.java:419)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop (ReadOperation.java:178)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start (ReadOperation.java:159)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute (MapTaskExecutor.java:77)
    at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process (StreamingDataflowWorker.java:1320)
    at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000 (StreamingDataflowWorker.java:151)
    at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run (StreamingDataflowWorker.java:1053)
    at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    at java.lang.Thread.run (Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.avro.data.TimeConversions$TimestampConversion
    at java.net.URLClassLoader.findClass (URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass (ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass (Launcher.java:335)
    at java.lang.ClassLoader.loadClass (ClassLoader.java:357)
```


[jira] [Comment Edited] (BEAM-8388) Update Avro to 1.9.1 from 1.8.2

2020-01-16 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017519#comment-17017519
 ] 

Aaron Dixon edited comment on BEAM-8388 at 1/16/20 9:51 PM:


Just to add more context/help here. The exception I see after upgrading from 
Beam 2.16->2.17 is this (running in Dataflow):

 
{code:java}
java.lang.NoClassDefFoundError: org/apache/avro/data/TimeConversions$TimestampConversion
    at org.apache.beam.sdk.coders.AvroCoder.<init> (AvroCoder.java:269)
    at org.apache.beam.sdk.coders.AvroCoder.of (AvroCoder.java:121)
    at org.apache.beam.sdk.io.kafka.KafkaUnboundedSource.getCheckpointMarkCoder (KafkaUnboundedSource.java:131)
    at org.apache.beam.runners.dataflow.worker.WorkerCustomSources$UnboundedReader.iterator (WorkerCustomSources.java:419)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop (ReadOperation.java:178)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start (ReadOperation.java:159)
    at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute (MapTaskExecutor.java:77)
    at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process (StreamingDataflowWorker.java:1320)
    at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000 (StreamingDataflowWorker.java:151)
    at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run (StreamingDataflowWorker.java:1053)
    at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    at java.lang.Thread.run (Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.avro.data.TimeConversions$TimestampConversion
    at java.net.URLClassLoader.findClass (URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass (ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass (Launcher.java:335)
    at java.lang.ClassLoader.loadClass (ClassLoader.java:357)
{code}






[jira] [Commented] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-17 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018151#comment-17018151
 ] 

Aaron Dixon commented on BEAM-9144:
---

[~suztomo] thank you for looking into this. Let me know if there's anything I 
can do to assist (QA/test, read code, etc.).

> Beam's own Avro TimeConversion class in beam-sdk-java-core 
> ---
>
> Key: BEAM-9144
> URL: https://issues.apache.org/jira/browse/BEAM-9144
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Fix For: 2.19.0
>
> Attachments: avro-beam-dependency-graph.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> From Aaron's comment in 
> https://issues.apache.org/jira/browse/BEAM-8388?focusedCommentId=17016476=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17016476
>  .
> {quote}My org must use Avro 1.9.x (due to some Avro schema resolution issues 
> resolved in 1.9.x) so downgrading Avro is not possible for us.
>  Beam 2.16.0 is compatible with our usage of Avro 1.9.x – but upgrading to 
> 2.17.0 we are broken as 2.17.0 links to Java classes in Avro 1.8.x that are 
> not available in 1.9.x.
> {quote}
> The Java class is 
> {{org.apache.avro.data.TimeConversions.TimestampConversion}} in Avro 1.8.
>  It's renamed to {{org.apache.avro.data.JodaTimeConversions}} in Avro 1.9.
> h1. Beam Java SDK cannot upgrade Avro to 1.9
> Beam has Spark runners and Spark has not yet upgraded to Avro 1.9.
> Illustration of the dependency
> !avro-beam-dependency-graph.png|width=799,height=385!
> h1. Short-term Solution
> As illustrated above, as long as Beam Java SDK uses only the intersection of 
> Avro classes, method, and fields between Avro 1.8 and 1.9, it will provide 
> flexibility in runtime Avro versions (as it did until Beam 2.16).
> h2. Difference of the TimeConversion Classes
> Avro 1.9's TimestampConversion overrides {{getRecommendedSchema}} method. 
> Details below:
> Avro 1.8's TimeConversions.TimestampConversion:
> {code:java}
>   public static class TimestampConversion extends Conversion<DateTime> {
>     @Override
>     public Class<DateTime> getConvertedType() {
>       return DateTime.class;
>     }
>     @Override
>     public String getLogicalTypeName() {
>       return "timestamp-millis";
>     }
>     @Override
>     public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
>       return new DateTime(millisFromEpoch, DateTimeZone.UTC);
>     }
>     @Override
>     public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
>       return timestamp.getMillis();
>     }
>   }
> {code}
> Avro 1.9's JodaTimeConversions.TimestampConversion:
> {code:java}
>   public static class TimestampConversion extends Conversion<DateTime> {
>     @Override
>     public Class<DateTime> getConvertedType() {
>       return DateTime.class;
>     }
>     @Override
>     public String getLogicalTypeName() {
>       return "timestamp-millis";
>     }
>     @Override
>     public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
>       return new DateTime(millisFromEpoch, DateTimeZone.UTC);
>     }
>     @Override
>     public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
>       return timestamp.getMillis();
>     }
>     @Override
>     public Schema getRecommendedSchema() {
>       return LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
>     }
>   }
> {code}
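
For anyone who wants to check which of the two class names is available at runtime without linking against either Avro version, a reflective probe is one option. This is only an illustrative sketch and not what the issue proposes (the issue proposes that Beam ship its own conversion class). The helper name `firstLoadable` is hypothetical, and the nested-class binary names are derived from the class names quoted above.

```java
import java.util.Optional;

public class ConversionProbe {
  // Binary names of the nested classes quoted in the issue description.
  static final String AVRO_1_8 = "org.apache.avro.data.TimeConversions$TimestampConversion";
  static final String AVRO_1_9 = "org.apache.avro.data.JodaTimeConversions$TimestampConversion";

  /** Returns the first candidate class name that can be loaded, if any. */
  static Optional<String> firstLoadable(String... candidates) {
    ClassLoader loader = ConversionProbe.class.getClassLoader();
    for (String name : candidates) {
      try {
        // initialize=false: only check presence, do not run static initializers.
        Class.forName(name, false, loader);
        return Optional.of(name);
      } catch (ClassNotFoundException | LinkageError e) {
        // Not available on this classpath; try the next candidate.
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    System.out.println(
        firstLoadable(AVRO_1_9, AVRO_1_8).orElse("no TimestampConversion found"));
  }
}
```

Run with the application's own classpath, this reports which variant the deployed Avro version provides.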



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-20 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019629#comment-17019629
 ] 

Aaron Dixon edited comment on BEAM-9144 at 1/20/20 4:34 PM:


Attempting to validate with 2.20.0-SNAPSHOT [1]

However, my pipeline is unable to start in Dataflow; I get:

{noformat}
java.lang.ClassNotFoundException: 
org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Message$Builder
{noformat}

When I inspect my build artifacts, I find this class (v1p21p0) present when I 
depend on Beam 2.16.0 (the last Beam release that works for me). When I build 
against Beam 2.20.0-SNAPSHOT, only v1p26p0 is resolved and packaged.

* I assume that the Dataflow runner somehow needs to know to look for v1p26p0? 
I.e., do I need to pass it a flag saying that I'm using 2.20.0-SNAPSHOT so 
that it knows to resolve the later versions of the rpc message (?) 
classes?
* --or-- Is this a bug in 2.20.0-SNAPSHOT?
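
The build-artifact inspection described above can be sketched as a small JAR scan. This is a minimal illustration, assuming the packaged artifact is a regular JAR. The class and method names (`VendoredGrpcScan`, `vendoredVersions`) are hypothetical, and the package prefix is taken from the exception message above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class VendoredGrpcScan {
  // Path prefix of the vendored gRPC packages, taken from the exception above.
  static final String PREFIX = "org/apache/beam/vendor/grpc/";

  /** Collects the version segments (e.g. "v1p21p0") found under PREFIX in one JAR. */
  static SortedSet<String> vendoredVersions(Path jar) throws IOException {
    SortedSet<String> versions = new TreeSet<>();
    try (JarFile jarFile = new JarFile(jar.toFile())) {
      Enumeration<JarEntry> entries = jarFile.entries();
      while (entries.hasMoreElements()) {
        String name = entries.nextElement().getName();
        if (name.startsWith(PREFIX)) {
          String rest = name.substring(PREFIX.length());
          int slash = rest.indexOf('/');
          if (slash > 0) {
            // The first path segment after the prefix is the vendored version.
            versions.add(rest.substring(0, slash));
          }
        }
      }
    }
    return versions;
  }

  public static void main(String[] args) throws IOException {
    // Pass one or more JAR paths on the command line.
    for (String arg : args) {
      System.out.println(arg + " -> " + vendoredVersions(Paths.get(arg)));
    }
  }
}
```

Running it against the standalone JAR built with each Beam version would show which vendored gRPC versions end up packaged.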


===
[1] Complete list of 2.20.0-SNAPSHOT dependencies resolved by my build:


{noformat}
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.pom
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.pom
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.pom
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.pom
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.pom
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.pom
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.pom
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.pom
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.jar
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.jar
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.jar
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.jar
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.jar
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.jar
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.jar
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.jar
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.jar
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.jar
{noformat}



[jira] [Commented] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-20 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019629#comment-17019629
 ] 

Aaron Dixon commented on BEAM-9144:
---

Attempting to validate with 2.20.0-SNAPSHOT [1]

However, my pipeline is unable to start in Dataflow; I get:
`java.lang.ClassNotFoundException: 
org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Message$Builder`

When I inspect my build artifacts, I find this class (v1p21p0) present when I 
depend on Beam 2.16.0 (the last Beam release that works for me). When I build 
against Beam 2.20.0-SNAPSHOT, only v1p26p0 is resolved and packaged.

* I assume that the Dataflow runner somehow needs to know to look for v1p26p0? 
I.e., do I need to pass it a flag saying that I'm using 2.20.0-SNAPSHOT so 
that it knows to resolve the later versions of the rpc message (?) 
classes?
* --or-- Is this a bug in 2.20.0-SNAPSHOT?


===
[1] Complete list of 2.20.0-SNAPSHOT dependencies resolved by my build:
```
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.pom
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.pom
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.pom
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.pom
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.pom
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.pom
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.pom
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.pom
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.jar
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.jar
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.jar
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.jar
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.jar
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.jar
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.jar
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.jar
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.jar
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.jar
```


[jira] [Commented] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-21 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020344#comment-17020344
 ] 

Aaron Dixon commented on BEAM-9144:
---

[~suztomo] This appears to be a runner issue:

The exception (a class-load failure) does not come from my code but from the 
Dataflow runner system, from the `StreamingDataflowWorker` class.

* beam-2.16.0 transitively includes the `org.apache.beam.vendor.grpc.v1p21p0.*` 
packages.
   * The Dataflow runtime attempts to load v1p21p0 classes and finds them 
present.
* beam-2.20.0-SNAPSHOT transitively includes the 
`org.apache.beam.vendor.grpc.v1p26p0.*` packages (_not_ v1p21p0 classes).
   * Dataflow's runtime system tries to load `*.v1p21p0.*` classes but no longer 
finds them. I expect it _should_ be attempting to load `*.v1p26p0.*` classes 
when the SDK version is 2.20.0-* .
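
To confirm which JAR (if any) supplies a given vendored class at runtime, one can ask the JVM for the class's code source. The following is a minimal sketch using only standard reflection. The class name `WhereIsClass` and the method `locate` are hypothetical, and the v1p26p0 class name is assumed by analogy with the v1p21p0 name in the exception.

```java
import java.security.CodeSource;

public class WhereIsClass {
  /** Describes where a class was loaded from, or why it could not be loaded. */
  static String locate(String className) {
    try {
      // initialize=false: resolve the class without running static initializers.
      Class<?> c = Class.forName(className, false, WhereIsClass.class.getClassLoader());
      CodeSource src = c.getProtectionDomain().getCodeSource();
      // Classes loaded by the bootstrap loader have no code source.
      if (src == null || src.getLocation() == null) {
        return "(bootstrap or unknown)";
      }
      return src.getLocation().toString();
    } catch (ClassNotFoundException | LinkageError e) {
      return "not found: " + e;
    }
  }

  public static void main(String[] args) {
    String suffix = ".com.google.protobuf.Message$Builder";
    for (String version : new String[] {"v1p21p0", "v1p26p0"}) {
      String name = "org.apache.beam.vendor.grpc." + version + suffix;
      System.out.println(name + " -> " + locate(name));
    }
  }
}
```

Run on the same classpath as the worker, this would show whether each vendored version resolves and from which JAR.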

Thoughts?



[jira] [Updated] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-21 Thread Aaron Dixon (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Dixon updated BEAM-9144:
--
Attachment: dataflow_step_job_id_OBFUSC-0.json



[jira] [Commented] (BEAM-9144) Beam's own Avro TimeConversion class in beam-sdk-java-core

2020-01-21 Thread Aaron Dixon (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020387#comment-17020387
 ] 

Aaron Dixon commented on BEAM-9144:
---

[~suztomo] In case it helps, I'm attaching the portion of my Dataflow startup 
logs that includes the exception: [^dataflow_step_job_id_OBFUSC-0.json] 

An interesting segment of the log shows the Java command that starts the 
worker. You can see my packaged JAR in the classpath, but also the Dataflow 
runner-framework JARs `/opt/google/dataflow/streaming/libWindmillServer.jar` 
and `/opt/google/dataflow/streaming/dataflow-worker.jar`. I suspect the 
classes within these Dataflow system JARs are involved in loading the old 
protobuf versions.

(Hope this helps.)

{noformat}
"Executing: java -Xmx5959858421 -XX:-OmitStackTraceInFastThrow 
-Xloggc:/var/log/dataflow/jvm-gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=512K -cp 
/opt/google/dataflow/streaming/libWindmillServer.jar:/opt/google/dataflow/streaming/dataflow-worker.jar:/opt/google/dataflow/slf4j/jcl_over_slf4j.jar:/opt/google/dataflow/slf4j/log4j_over_slf4j.jar:/opt/google/dataflow/slf4j/log4j_to_slf4j.jar:/var/opt/google/dataflow/**ARTIFACT_MISC-DISGUISED**-0.1.0-SNAPSHOT-standalone-Eat01V3rIGIwd4f4ctrF_w.jar
 -Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.port= 
-Dcom.sun.management.jmxremote.rmi.port= 
-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote=true 
-Ddataflow.worker.json.logging.location=/var/log/dataflow/dataflow.json.log 
-Ddataflow.worker.logging.filepath=/var/log/dataflow/dataflow-json.log 
-Ddataflow.worker.logging.location=/var/log/dataflow/dataflow-worker.log 
-Djava.rmi.server.hostname=localhost 
-Djava.security.properties=/opt/google/dataflow/tls/disable_gcm.properties 
-Djob_id=2020-01-20_08_13_54-12155510520957136076 
-Dsdk_pipeline_options_file=/var/opt/google/dataflow/pipeline_options.json 
-Dstatus_port=8081 
-Dwindmill.hostport=tcp://**JOB_NAME-DISGUISED**1579536-01200813-7s1g-harness-r5j8:12346
 -Dworker_id=**JOB_NAME-DISGUISED**1579536-01200813-7s1g-harness-r5j8 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker"
{noformat}
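
Since the JVM searches classpath entries in the order given, it can help to list the `-cp` entries of a logged command like the one above. The sketch below is a minimal illustration. The class and method names are hypothetical, and the command in `main` is a shortened stand-in for the log line (with `my-pipeline.jar` as a placeholder), not the real one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClasspathOrder {
  /** Extracts the entries of the -cp / -classpath argument from a java command line. */
  static List<String> classpathEntries(String command) {
    String[] tokens = command.trim().split("\\s+");
    for (int i = 0; i < tokens.length - 1; i++) {
      if (tokens[i].equals("-cp") || tokens[i].equals("-classpath")) {
        // Entries are ':'-separated on Linux, as in the log above.
        return new ArrayList<>(Arrays.asList(tokens[i + 1].split(":")));
      }
    }
    return new ArrayList<>();
  }

  public static void main(String[] args) {
    String command =
        "java -Xmx1g -cp /opt/google/dataflow/streaming/libWindmillServer.jar:"
            + "/opt/google/dataflow/streaming/dataflow-worker.jar:"
            + "/var/opt/google/dataflow/my-pipeline.jar "
            + "org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker";
    List<String> entries = classpathEntries(command);
    for (int i = 0; i < entries.size(); i++) {
      System.out.println(i + ": " + entries.get(i));
    }
  }
}
```

In the logged command, the Dataflow system JARs come before the user's packaged JAR.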


> Beam's own Avro TimeConversion class in beam-sdk-java-core 
> ---
>
> Key: BEAM-9144
> URL: https://issues.apache.org/jira/browse/BEAM-9144
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Fix For: 2.19.0
>
> Attachments: avro-beam-dependency-graph.png, 
> dataflow_step_job_id_OBFUSC-0.json
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> From Aaron's comment in 
> https://issues.apache.org/jira/browse/BEAM-8388?focusedCommentId=17016476=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17016476
>  .
> {quote}My org must use Avro 1.9.x (due to some Avro schema resolution issues 
> resolved in 1.9.x) so downgrading Avro is not possible for us.
>  Beam 2.16.0 is compatible with our usage of Avro 1.9.x – but upgrading to 
> 2.17.0 we are broken as 2.17.0 links to Java classes in Avro 1.8.x that are 
> not available in 1.9.x.
> {quote}
> The Java class is 
> {{org.apache.avro.data.TimeConversions.TimestampConversion}} in Avro 1.8.
>  In Avro 1.9 it lives in {{org.apache.avro.data.JodaTimeConversions}} instead 
> (as {{JodaTimeConversions.TimestampConversion}}).
> h1. Beam Java SDK cannot upgrade Avro to 1.9
> Beam has Spark runners and Spark has not yet upgraded to Avro 1.9.
> Illustration of the dependency
> !avro-beam-dependency-graph.png|width=799,height=385!
> h1. Short-term Solution
> As illustrated above, as long as the Beam Java SDK uses only the intersection of 
> Avro classes, methods, and fields between Avro 1.8 and 1.9, it retains 
> flexibility in the runtime Avro version (as it did until Beam 2.16).
> h2. Difference of the TimeConversion Classes
> Avro 1.9's TimestampConversion additionally overrides the 
> {{getRecommendedSchema}} method. Details below:
> Avro 1.8's TimeConversions.TimestampConversion:
> {code:java}
>   public static class TimestampConversion extends Conversion<DateTime> {
>     @Override
>     public Class<DateTime> getConvertedType() {
>       return DateTime.class;
>     }
>     @Override
>     public String getLogicalTypeName() {
>       return "timestamp-millis";
>     }
>     @Override
>     public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType type) {
>       return new DateTime(millisFromEpoch, DateTimeZone.UTC);
>     }
>     @Override
>     public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
>       return timestamp.getMillis();
>     }
>   }
> {code}
> Avro 1.9's JodaTimeConversions.TimestampConversion:
> {code:java}
>   public static class