[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties
    [ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012594#comment-17012594 ]

leesf commented on HUDI-114:
----------------------------

Fixed via master: 3c90d252cc464fbd4ec3554fc930e41a0fcaa29f

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie, Writer Core
>            Reporter: Nishith Agarwal
>            Assignee: Pratyaksh Sharma
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.5.1
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, once the payload class is set once in hoodie.properties, it cannot
> be changed. In some cases, if a code refactor is done and the jar updated,
> one may need to pass the new payload class name.
> Also, fix picking up the payload name for datasource API. By default
> HoodieAvroPayload is written whereas for datasource API default is
> OverwriteLatestAvroPayload

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties
    [ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986775#comment-16986775 ]

Pratyaksh Sharma commented on HUDI-114:
---------------------------------------

[~nishith29] I have raised a PR for the code changes and will raise one for the doc changes as well shortly.

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Pratyaksh Sharma
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, once the payload class is set once in hoodie.properties, it cannot
> be changed. In some cases, if a code refactor is done and the jar updated,
> one may need to pass the new payload class name.
> Also, fix picking up the payload name for datasource API. By default
> HoodieAvroPayload is written whereas for datasource API default is
> OverwriteLatestAvroPayload

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties
    [ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970427#comment-16970427 ]

Nishith Agarwal commented on HUDI-114:
--------------------------------------

[~Pratyaksh] The payload class can be a custom implementation of HoodieRecordPayload, which is used to perform custom merge operations during compaction and at query time. The payload class is currently stored as a fully qualified class name. If one refactors the code, renames the class, or simply wants to implement a new merge strategy, compaction and queries would not pick up the change even if the class name is changed in the DeltaStreamer or the SparkDataSource.

I would not recommend rewriting the hoodie.properties file every time, since this change is probably required very infrequently. A good approach would be to add a flag that lets one override the payload class in the hoodie.properties file when a user chooses to do so, and to add documentation for it.

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Pratyaksh Sharma
>            Priority: Minor
>
> Right now, once the payload class is set once in hoodie.properties, it cannot
> be changed. In some cases, if a code refactor is done and the jar updated,
> one may need to pass the new payload class name.
> Also, fix picking up the payload name for datasource API. By default
> HoodieAvroPayload is written whereas for datasource API default is
> OverwriteLatestAvroPayload

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
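The flag-based approach suggested in the comment above could look roughly like the sketch below. It is only an illustration and not Hudi's implementation: the class PayloadClassOverride, the method resolve, and the overrideFlag parameter are hypothetical, and the key hoodie.compaction.payload.class is an assumption about how the payload class is recorded in hoodie.properties.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Hudi.
final class PayloadClassOverride {
    // Assumed key under which the payload class is stored in hoodie.properties.
    static final String PAYLOAD_CLASS_KEY = "hoodie.compaction.payload.class";

    private PayloadClassOverride() {}

    /**
     * Returns a copy of the table properties. The stored payload class is
     * replaced only when the caller opts in through overrideFlag; if no
     * payload class is stored yet, the requested one is simply recorded.
     * The input properties are never modified.
     */
    static Properties resolve(Properties tableProps,
                              String requestedPayloadClass,
                              boolean overrideFlag) {
        Properties result = new Properties();
        result.putAll(tableProps);

        boolean nothingStored = tableProps.getProperty(PAYLOAD_CLASS_KEY) == null;
        if (requestedPayloadClass != null && (nothingStored || overrideFlag)) {
            result.setProperty(PAYLOAD_CLASS_KEY, requestedPayloadClass);
        }
        return result;
    }
}
```

With the flag off, a table that already records com.example.OldPayload keeps it even when a writer asks for com.example.NewPayload, so hoodie.properties is not rewritten on every run; the stored value changes only when the user explicitly sets the flag.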
[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties
    [ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969936#comment-16969936 ]

Pratyaksh Sharma commented on HUDI-114:
---------------------------------------

Hi [~nishith29],

Here is how I plan to do it: when initialising DeltaSync, if the target base path already exists, I will simply overwrite the payload name in the hoodie.properties file every time with the value passed by the user, provided the storage type is MERGE_ON_READ. Let me know if this makes sense to you.

I went through the code; we store the payload class name in the hoodie.properties file if the table type is MERGE_ON_READ. I have not yet gone through the entire flow for MERGE_ON_READ tables. It would be great if you could explain why we want to implement this functionality of overwriting the payload class. I cannot completely relate to this idea right now. With your inputs, I will be able to check whether I am missing any scenario.

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Pratyaksh Sharma
>            Priority: Minor
>
> Right now, once the payload class is set once in hoodie.properties, it cannot
> be changed. In some cases, if a code refactor is done and the jar updated,
> one may need to pass the new payload class name.
> Also, fix picking up the payload name for datasource API. By default
> HoodieAvroPayload is written whereas for datasource API default is
> OverwriteLatestAvroPayload

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
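The plan described in the comment above (rewrite the payload class in an existing hoodie.properties when the table is MERGE_ON_READ) can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the PR: PayloadPropertyRewriter and overwriteIfMergeOnRead are hypothetical names, the keys hoodie.table.type and hoodie.compaction.payload.class are assumed, and a local file stands in for the table's metadata storage.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Hudi.
final class PayloadPropertyRewriter {
    // Assumed property keys in hoodie.properties.
    static final String TABLE_TYPE_KEY = "hoodie.table.type";
    static final String PAYLOAD_CLASS_KEY = "hoodie.compaction.payload.class";

    private PayloadPropertyRewriter() {}

    /**
     * Overwrites the stored payload class with the user-supplied one when the
     * properties file exists and describes a MERGE_ON_READ table.
     * Returns true only if the file was actually rewritten.
     */
    static boolean overwriteIfMergeOnRead(Path propsFile, String payloadClass)
            throws IOException {
        if (!Files.exists(propsFile)) {
            return false; // no existing table: nothing to overwrite
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(propsFile)) {
            props.load(in);
        }
        if (!"MERGE_ON_READ".equals(props.getProperty(TABLE_TYPE_KEY))) {
            return false; // other table types are left untouched
        }
        if (payloadClass.equals(props.getProperty(PAYLOAD_CLASS_KEY))) {
            return false; // already up to date: skip the write
        }
        props.setProperty(PAYLOAD_CLASS_KEY, payloadClass);
        try (OutputStream out = Files.newOutputStream(propsFile)) {
            props.store(out, "payload class updated");
        }
        return true;
    }
}
```

The sketch skips the write when the stored value already matches, which keeps repeated runs from touching the file; whether the overwrite should additionally be gated behind an explicit user flag is a separate design choice.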
[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties
    [ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969206#comment-16969206 ]

Pratyaksh Sharma commented on HUDI-114:
---------------------------------------

[~nishith29] Yes, I would like some more clarification before I start working on it. Specifically, I want more context on why one may need to pass a new payload class name. It is already possible to configure it at run time using the HoodieDeltaStreamer.Config class. Also, which datasource API are you referring to?

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Pratyaksh Sharma
>            Priority: Minor
>
> Right now, once the payload class is set once in hoodie.properties, it cannot
> be changed. In some cases, if a code refactor is done and the jar updated,
> one may need to pass the new payload class name.
> Also, fix picking up the payload name for datasource API. By default
> HoodieAvroPayload is written whereas for datasource API default is
> OverwriteLatestAvroPayload

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties
    [ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968525#comment-16968525 ]

Nishith Agarwal commented on HUDI-114:
--------------------------------------

[~Pratyaksh] Glad to have you pick this up! Let me know if you need more clarifications on this.

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Nishith Agarwal
>            Priority: Minor
>
> Right now, once the payload class is set once in hoodie.properties, it cannot
> be changed. In some cases, if a code refactor is done and the jar updated,
> one may need to pass the new payload class name.
> Also, fix picking up the payload name for datasource API. By default
> HoodieAvroPayload is written whereas for datasource API default is
> OverwriteLatestAvroPayload

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties
    [ https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968225#comment-16968225 ]

Pratyaksh Sharma commented on HUDI-114:
---------------------------------------

[~nishith29] I would like to take it up.

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Nishith Agarwal
>            Priority: Minor
>
> Right now, once the payload class is set once in hoodie.properties, it cannot
> be changed. In some cases, if a code refactor is done and the jar updated,
> one may need to pass the new payload class name.
> Also, fix picking up the payload name for datasource API. By default
> HoodieAvroPayload is written whereas for datasource API default is
> OverwriteLatestAvroPayload

--
This message was sent by Atlassian Jira
(v8.3.4#803005)