[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=364278&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364278
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 29/Dec/19 03:37
Start Date: 29/Dec/19 03:37
Worklog Time Spent: 10m 
  Work Description: chrlarsen commented on issue #10290: [BEAM-8561] Add 
ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#issuecomment-569471049
 
 
   Run CommunityMetrics PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 364278)
Time Spent: 9h 40m  (was: 9.5h)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
> Issue Type: New Feature
> Components: io-java-files
> Reporter: Chris Larsen
> Assignee: Chris Larsen
> Priority: Major
> Time Spent: 9h 40m
> Remaining Estimate: 0h
>
> Similar to AvroIO, it would be very useful to support reading from and 
> writing to Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=364273&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364273
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 29/Dec/19 02:14
Start Date: 29/Dec/19 02:14
Worklog Time Spent: 10m 
  Work Description: chrlarsen commented on issue #10290: [BEAM-8561] Add 
ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#issuecomment-569467327
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 364273)
Time Spent: 9.5h  (was: 9h 20m)



[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=364271&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364271
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 29/Dec/19 00:19
Start Date: 29/Dec/19 00:19
Worklog Time Spent: 10m 
  Work Description: chrlarsen commented on issue #10290: [BEAM-8561] Add 
ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#issuecomment-569462219
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 364271)
Time Spent: 9h 20m  (was: 9h 10m)



[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=364269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364269
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 29/Dec/19 00:18
Start Date: 29/Dec/19 00:18
Worklog Time Spent: 10m 
  Work Description: chrlarsen commented on issue #10290: [BEAM-8561] Add 
ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#issuecomment-569462167
 
 
   Run CommunityMetrics PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 364269)
Time Spent: 9h  (was: 8h 50m)



[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=364270&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364270
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 29/Dec/19 00:18
Start Date: 29/Dec/19 00:18
Worklog Time Spent: 10m 
  Work Description: chrlarsen commented on issue #10290: [BEAM-8561] Add 
ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#issuecomment-569462167
 
 
   Run CommunityMetrics PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 364270)
Time Spent: 9h 10m  (was: 9h)



[jira] [Work logged] (BEAM-9030) Bump the version of GRPC to 1.22.0+(May be latest 1.26.0, currently 1.21.0)

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9030?focusedWorklogId=364267&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364267
 ]

ASF GitHub Bot logged work on BEAM-9030:


Author: ASF GitHub Bot
Created on: 28/Dec/19 23:56
Start Date: 28/Dec/19 23:56
Worklog Time Spent: 10m 
  Work Description: suztomo commented on pull request #10463: [BEAM-9030] 
Bump grpc to 1.26.0
URL: https://github.com/apache/beam/pull/10463#discussion_r361819262
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoringOld.groovy
 ##
 @@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.gradle
+
+import org.gradle.api.Project
+
+/**
+ * Utilities for working with our vendored version of gRPC.
+ */
+class GrpcVendoringOld {
 
 Review comment:
   > How about making a small improvement to 
https://beam.apache.org/contribute/release-guide/ for the separate release process?
   
   Yes, I expect that’s the first step (can Lukasz take this?). Then add the 
URL to “vendor/grpc-1_26_0/build.gradle”.
 



Issue Time Tracking
---

Worklog Id: (was: 364267)
Time Spent: 3h  (was: 2h 50m)

> Bump the version of GRPC to 1.22.0+(May be latest 1.26.0, currently 1.21.0)
> ---
>
> Key: BEAM-9030
> URL: https://issues.apache.org/jira/browse/BEAM-9030
> Project: Beam
> Issue Type: Bug
> Components: java-fn-execution, runner-flink
> Reporter: sunjincheng
> Assignee: sunjincheng
> Priority: Major
> Fix For: 2.19.0
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> When a Python word count job is submitted repeatedly to a Flink 
> session/standalone cluster, the metaspace usage of the Flink task manager 
> increases continuously (about 40MB each time). The reason is that the Beam 
> classes are loaded with the user class loader in Flink, and problems in the 
> implementation of `ProcessManager` (from Beam) and `ThreadPoolCache` (from 
> Netty) can prevent the user class loader from being garbage collected even 
> after the job has finished, which eventually causes a metaspace memory leak. 
> See FLINK-15338[1] for more information.
> Regarding `ProcessManager`, I have created BEAM-9006[2] to track it. 
> Regarding `ThreadPoolCache`, it is a Netty problem that has been fixed in 
> NETTY#8955[3]. Netty 4.1.35.Final already includes this fix, and gRPC 1.22.0 
> already depends on Netty 4.1.35.Final. So we need to bump the version of gRPC 
> to 1.22.0+ (currently 1.21.0).
>  
> What do you think?
> [1] https://issues.apache.org/jira/browse/FLINK-15338
> [2] https://issues.apache.org/jira/browse/BEAM-9006
> [3] https://github.com/netty/netty/pull/8955
>  





[jira] [Work logged] (BEAM-9030) Bump the version of GRPC to 1.22.0+(May be latest 1.26.0, currently 1.21.0)

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9030?focusedWorklogId=364268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364268
 ]

ASF GitHub Bot logged work on BEAM-9030:


Author: ASF GitHub Bot
Created on: 28/Dec/19 23:56
Start Date: 28/Dec/19 23:56
Worklog Time Spent: 10m 
  Work Description: suztomo commented on pull request #10463: [BEAM-9030] 
Bump grpc to 1.26.0
URL: https://github.com/apache/beam/pull/10463#discussion_r361819287
 
 

 ##
 File path: vendor/grpc-1_26_0/build.gradle
 ##
 @@ -20,7 +20,7 @@ import org.apache.beam.gradle.GrpcVendoring
 
 plugins { id 'org.apache.beam.vendor-java' }
 
-description = "Apache Beam :: Vendored Dependencies :: gRPC :: 1.21.0"
+description = "Apache Beam :: Vendored Dependencies :: gRPC :: 1.26.0"
 
 Review comment:
   I suggest adding a source code comment: “When upgrading the version, refer to 
http://...”
 



Issue Time Tracking
---

Worklog Id: (was: 364268)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (BEAM-8505) Too many variations of FnApiRunnerTest are running

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8505?focusedWorklogId=364260&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364260
 ]

ASF GitHub Bot logged work on BEAM-8505:


Author: ASF GitHub Bot
Created on: 28/Dec/19 20:59
Start Date: 28/Dec/19 20:59
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #9910: [BEAM-8505] 
Reducing the variations of FnApiRunnerTest
URL: https://github.com/apache/beam/pull/9910#issuecomment-569450754
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   
 



Issue Time Tracking
---

Worklog Id: (was: 364260)
Time Spent: 1h  (was: 50m)

> Too many variations of FnApiRunnerTest are running
> --
>
> Key: BEAM-8505
> URL: https://issues.apache.org/jira/browse/BEAM-8505
> Project: Beam
> Issue Type: Improvement
> Components: testing
> Reporter: Pablo Estrada
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> These variations add up to make the Python Precommit suite very slow.





[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=364248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364248
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Dec/19 16:21
Start Date: 28/Dec/19 16:21
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #10422: [BEAM-2535] 
TimerData signature update
URL: https://github.com/apache/beam/pull/10422
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 364248)
Time Spent: 9.5h  (was: 9h 20m)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, sdk-java-core
> Reporter: Kenneth Knowles
> Assignee: Shehzaad Nakhoda
> Priority: Major
> Time Spent: 9.5h
> Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
>  1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window.
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery, for 
> example when there is no possibility of output from the timer callback.
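
The expiration rule proposed in the description above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Beam's implementation: the class TimerWatermarkSketch, its method names, the use of plain long millisecond timestamps, and the simplified "end of window plus allowed lateness" expiration test are all hypothetical. It shows how taking the minimum of the input watermark and the timer watermark keeps a window alive while a timer with an earlier output timestamp is still pending.

```java
import java.util.Arrays;
import java.util.OptionalLong;

/** Hypothetical sketch; none of these names are Beam APIs. */
final class TimerWatermarkSketch {

  /** Timer watermark: the earliest output timestamp among pending timers, if any. */
  static OptionalLong timerWatermark(long[] pendingTimerOutputTimestamps) {
    return Arrays.stream(pendingTimerOutputTimestamps).min();
  }

  /** Watermark used for late-data dropping and window expiration. */
  static long effectiveWatermark(long inputWatermark, OptionalLong timerWatermark) {
    return timerWatermark.isPresent()
        ? Math.min(inputWatermark, timerWatermark.getAsLong())
        : inputWatermark;
  }

  /** Simplifying assumption: a window expires once the watermark passes end + lateness. */
  static boolean isExpired(long windowEnd, long allowedLateness, long effectiveWatermark) {
    return effectiveWatermark > windowEnd + allowedLateness;
  }

  public static void main(String[] args) {
    long inputWatermark = 2_000L;
    long windowEnd = 1_000L;

    // A pending timer with output timestamp 900 holds the effective watermark back.
    long held = effectiveWatermark(inputWatermark, timerWatermark(new long[] {900L}));
    System.out.println("expired while timer pending: " + isExpired(windowEnd, 0L, held));

    // With no pending timers, the input watermark alone decides.
    long released = effectiveWatermark(inputWatermark, timerWatermark(new long[0]));
    System.out.println("expired with no timers: " + isExpired(windowEnd, 0L, released));
  }
}
```

In this sketch the timer watermark acts as the "watermark hold" mentioned in the description: as long as any timer is set, the window is not garbage collected.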





[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=364178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364178
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 28/Dec/19 09:21
Start Date: 28/Dec/19 09:21
Worklog Time Spent: 10m 
  Work Description: chrlarsen commented on issue #10290: [BEAM-8561] Add 
ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#issuecomment-569401090
 
 
   @chamikaramj @steveniemitz @gsteelman the PR has been refactored to 
read/write Thrift encoded data. It would be great to get some more feedback, 
thanks!
 



Issue Time Tracking
---

Worklog Id: (was: 364178)
Time Spent: 8h 50m  (was: 8h 40m)



[jira] [Resolved] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-28 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9035.
--
Fix Version/s: 2.19.0
   Resolution: Fixed

Ready for review

> Typed options for Row Schema and Fields
> ---
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
> Issue Type: Task
> Components: sdk-java-core
> Reporter: Alex Van Boxel
> Assignee: Alex Van Boxel
> Priority: Major
> Fix For: 2.19.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. In the options, schema converters can add 
> options/annotations/decorators that were in the original schema; this context 
> can be used in the rest of the pipeline for specific transformations or to 
> augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of having typed 
> values are: 
>  * Having strongly typed options would give a *portable way of Logical Types* 
> to have structured information that could be shared over different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi-ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature was discussed with Reuven Lax and Brian Hulette.
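
The "key/typed value" idea in the description above can be illustrated with a small Java sketch. This is not the Beam Schema API and not the code in the pull request: the class OptionsSketch, its nested TypedValue class, the set/get methods, and the example option names are all hypothetical, chosen only to show how storing a declared type next to each value lets a reader retrieve an option without losing its type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of key/typed-value options; not the Beam Schema API. */
final class OptionsSketch {

  /** A value stored together with its declared type. */
  static final class TypedValue {
    final Class<?> type;
    final Object value;

    TypedValue(Class<?> type, Object value) {
      if (!type.isInstance(value)) {
        throw new IllegalArgumentException("Value is not of type " + type.getName());
      }
      this.type = type;
      this.value = value;
    }
  }

  private final Map<String, TypedValue> options = new LinkedHashMap<>();

  /** Stores an option under a key, remembering its type. */
  <T> OptionsSketch set(String key, Class<T> type, T value) {
    options.put(key, new TypedValue(type, value));
    return this;
  }

  /** Reads an option back; fails if the key is missing or the requested type differs. */
  <T> T get(String key, Class<T> type) {
    TypedValue typed = options.get(key);
    if (typed == null) {
      throw new IllegalArgumentException("No option named " + key);
    }
    if (!type.equals(typed.type)) {
      throw new IllegalArgumentException(
          "Option " + key + " has type " + typed.type.getName());
    }
    return type.cast(typed.value);
  }

  public static void main(String[] args) {
    // Example option names are made up for illustration.
    OptionsSketch fieldOptions = new OptionsSketch()
        .set("source", String.class, "orders.proto")
        .set("primary_key", Boolean.class, true)
        .set("cluster_index", Integer.class, 2);

    System.out.println(fieldOptions.get("source", String.class));
    System.out.println(fieldOptions.get("primary_key", Boolean.class));
    System.out.println(fieldOptions.get("cluster_index", Integer.class));
  }
}
```

Keeping the type next to the value is what would allow a conversion from a format with strongly typed options, such as Protobuf, to round-trip without collapsing everything to strings.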





[jira] [Work logged] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?focusedWorklogId=364171&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364171
 ]

ASF GitHub Bot logged work on BEAM-9035:


Author: ASF GitHub Bot
Created on: 28/Dec/19 08:54
Start Date: 28/Dec/19 08:54
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on issue #10413: [BEAM-9035] 
Typed options for Row Schema and Field
URL: https://github.com/apache/beam/pull/10413#issuecomment-569399464
 
 
   @TheNeuralBit @reuvenlax the first PR of the schema options feature is ready 
for review. Reuven has already done a pre-review while this PR was in draft. See 
the tickets for more details; coming in later pull requests are: options on 
Logical types (removing metadata and making LT portable), conversion from proto 
and avro, ...
   
   Note: the build infrastructure for Java seems to be hanging. 
 



Issue Time Tracking
---

Worklog Id: (was: 364171)
Time Spent: 0.5h  (was: 20m)
