[ https://issues.apache.org/jira/browse/BEAM-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259488#comment-16259488 ]
Luke Cwik commented on BEAM-3227:
---------------------------------

That makes a lot of sense.

> Consider sharing Udf/SdkFunctionSpec records via pointer
> --------------------------------------------------------
>
>                 Key: BEAM-3227
>                 URL: https://issues.apache.org/jira/browse/BEAM-3227
>             Project: Beam
>          Issue Type: Sub-task
>          Components: beam-model
>            Reporter: Kenneth Knowles
>
> Coders are stored by pointer, because they are often repeated and a common
> source of huge pipeline descriptions.
> We considered doing the same for all UDFs but decided not to, based on the
> logic that they are not as often identical and will rarely implement the
> equals() needed to actually share encoded versions.
> However, in the presence of generated code, it is very likely that DoFns and
> CombineFns are repeated, and also much more likely that they have meaningful
> equals(), so there could be size savings.
> None of this is terribly important for storage or transmission, but has more
> to do with arbitrary and small size limits that occur in some API frameworks
> or database column types.
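
For illustration only, here is a rough sketch of the sharing idea in the quoted description: records that compare equal are stored once and referenced by id everywhere else. It assumes the records have value-based equals()/hashCode(), which is the precondition the description calls out. SpecPool, FunctionSpec, the urn/payload fields, and the "fn<N>" id scheme are all hypothetical names made up for this sketch; none of them are Beam APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical pool that stores each distinct function spec once and
// hands out a short id ("pointer") that transforms could reference.
final class SpecPool {

    // Hypothetical stand-in for an encoded UDF record: a URN plus a payload.
    static final class FunctionSpec {
        final String urn;
        final String payload;

        FunctionSpec(String urn, String payload) {
            this.urn = urn;
            this.payload = payload;
        }

        // Value-based equality is what makes sharing possible at all.
        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof FunctionSpec)) {
                return false;
            }
            FunctionSpec other = (FunctionSpec) o;
            return urn.equals(other.urn) && payload.equals(other.payload);
        }

        @Override
        public int hashCode() {
            return Objects.hash(urn, payload);
        }
    }

    // Distinct specs in first-seen order, mapped to their assigned ids.
    private final Map<FunctionSpec, String> idsBySpec = new LinkedHashMap<>();

    // Returns the id for an equal, already registered spec, or assigns a new one.
    String register(FunctionSpec spec) {
        String id = idsBySpec.get(spec);
        if (id == null) {
            id = "fn" + idsBySpec.size();
            idsBySpec.put(spec, id);
        }
        return id;
    }

    // Number of distinct specs actually stored.
    int size() {
        return idsBySpec.size();
    }

    public static void main(String[] args) {
        SpecPool pool = new SpecPool();
        // Two equal records, as generated code might produce, plus one different record.
        String a = pool.register(new FunctionSpec("urn:example:dofn", "generated-body"));
        String b = pool.register(new FunctionSpec("urn:example:dofn", "generated-body"));
        String c = pool.register(new FunctionSpec("urn:example:combinefn", "sum"));
        System.out.println(a + " " + b + " " + c + " stored=" + pool.size());
    }
}
```

Without a meaningful equals(), every record would get its own id and the pool would save nothing, which matches the original reason for not sharing UDFs this way.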