[ 
https://issues.apache.org/jira/browse/BEAM-11457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251822#comment-17251822
 ] 

Alexey Romanenko edited comment on BEAM-11457 at 12/18/20, 3:23 PM:
--------------------------------------------------------------------

Well, I think I was mistaken and it's NOT a good idea to use Hadoop 
Configuration to configure the internal behavior of HadoopFormatIO. At the same 
time, I see that your change will affect both the key and value classes at 
once. Is it possible that a user would want to skip cloning for only the key 
or only the value class, but not both?
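
To make the question concrete, here is a rough sketch of what independent per-key and per-value switches could look like. All names in it ({{CloneSettings}}, {{skipKeyClone}}, {{skipValueClone}}, {{prepareKey}}, {{prepareValue}}) are hypothetical and are not part of HadoopFormatIO; the cloner is passed in as a plain function as a stand-in for the coder-based clone.

```java
import java.util.function.UnaryOperator;

/** Hypothetical sketch only; none of these names exist in HadoopFormatIO. */
final class CloneSettings {
  private final boolean skipKeyClone;
  private final boolean skipValueClone;

  CloneSettings(boolean skipKeyClone, boolean skipValueClone) {
    this.skipKeyClone = skipKeyClone;
    this.skipValueClone = skipValueClone;
  }

  /** Returns the key itself when key cloning is skipped, otherwise the cloner's copy. */
  <K> K prepareKey(K key, UnaryOperator<K> cloner) {
    return skipKeyClone ? key : cloner.apply(key);
  }

  /** Returns the value itself when value cloning is skipped, otherwise the cloner's copy. */
  <V> V prepareValue(V value, UnaryOperator<V> cloner) {
    return skipValueClone ? value : cloner.apply(value);
  }
}
```

With {{new CloneSettings(true, false)}} the key would be passed through untouched while the value would still be cloned.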

I was also thinking about providing a way to update the list of immutable 
classes {{immutableTypes}}. Currently it's a static Set, but we could wrap it in 
getters/setters and also provide a user API to update it, for example a 
new method {{withImmutableTypesUpdates(Set<Class<?>> immutableTypes)}} that 
updates the current Set. Wdyt?
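
A minimal sketch of that idea, under stated assumptions: {{ImmutableTypeRegistry}} and {{needsClone}} are hypothetical names, and only {{withImmutableTypesUpdates}} and {{immutableTypes}} come from the proposal above. The sketch holds the set per instance rather than in a static field, and returns an updated copy instead of mutating the current Set, which is one possible reading of the proposal and not necessarily how it would be implemented.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of an instance-scoped set of immutable types. */
final class ImmutableTypeRegistry {
  private final Set<Class<?>> immutableTypes;

  ImmutableTypeRegistry(Set<Class<?>> defaults) {
    // Defensive copy so later changes to the caller's set have no effect here.
    this.immutableTypes = Collections.unmodifiableSet(new HashSet<>(defaults));
  }

  /** Returns a new registry holding the current types plus the given additions. */
  ImmutableTypeRegistry withImmutableTypesUpdates(Set<Class<?>> additions) {
    Set<Class<?>> merged = new HashSet<>(immutableTypes);
    merged.addAll(additions);
    return new ImmutableTypeRegistry(merged);
  }

  /** An element needs a clone unless its exact class is registered as immutable. */
  boolean needsClone(Object element) {
    return element != null && !immutableTypes.contains(element.getClass());
  }
}
```

Returning a copy keeps the default set unchanged for other users of the IO; mutating a shared static Set would instead affect every pipeline in the same JVM.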


was (Author: aromanenko):
Well, I think it's NOT a good idea to use Hadoop Configuration to configure the 
internal behavior of HadoopFormatIO. At the same time, I see that your change 
will affect both the key and value classes at once. Is it possible that a 
user would want to skip cloning for only the key or only the value class, but 
not both?

I was also thinking about providing a way to update the list of immutable 
classes {{immutableTypes}}. Currently it's a static Set, but we could wrap it in 
getters/setters and also provide a user API to update it, for example a 
new method {{withImmutableTypesUpdates(Set<Class<?>> immutableTypes)}} that 
updates the current Set. Wdyt?

> Enable skip key-value clone for HadoopFormatIO 
> -----------------------------------------------
>
>                 Key: BEAM-11457
>                 URL: https://issues.apache.org/jira/browse/BEAM-11457
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-hadoop-format
>    Affects Versions: 2.25.0
>            Reporter: Jozef Vilcek
>            Assignee: Jozef Vilcek
>            Priority: P3
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HadoopFormatIO eagerly clones key-values if they are not well-known 
> immutable types. This makes sense given how Hadoop Writables behave. However, 
> a user can supply key/value translation functions that may already output 
> immutable types. In such a case it would be beneficial if the extra clone via 
> the coder could be avoided.
> It would be great if the coder could be consulted about the type and its need 
> for a clone. However, I am not aware whether such detection is possible. I 
> propose adding a config parameter for skipping the clone, which the IO user 
> can set.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
