[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2020-01-10 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012594#comment-17012594
 ] 

leesf commented on HUDI-114:
----------------------------

Fixed via master: 3c90d252cc464fbd4ec3554fc930e41a0fcaa29f

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie, Writer Core
>            Reporter: Nishith Agarwal
>            Assignee: Pratyaksh Sharma
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.5.1
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, once the payload class is set in hoodie.properties, it cannot be
> changed. In some cases, if the code is refactored and the jar updated, one
> may need to pass a new payload class name.
> Also, fix how the payload class name is picked up for the datasource API: by
> default HoodieAvroPayload is written, whereas the datasource API default is
> OverwriteWithLatestAvroPayload.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2019-12-03 Thread Pratyaksh Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986775#comment-16986775
 ] 

Pratyaksh Sharma commented on HUDI-114:
---------------------------------------

[~nishith29] I have raised a PR for the code changes and will raise one for
the doc changes shortly.

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Pratyaksh Sharma
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h


[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2019-11-08 Thread Nishith Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970427#comment-16970427
 ] 

Nishith Agarwal commented on HUDI-114:
--------------------------------------

[~Pratyaksh] The payload class can be a custom implementation of
HoodieRecordPayload. It is used to perform custom merge operations during
compaction and at query time.

The payload class is currently stored as a fully qualified class name. If one
refactors the code, renames the class, or just wants to implement a new merge
strategy, compaction and queries will not pick up the change, even if the
class name is changed from the DeltaStreamer or from the SparkDataSource.

I would not recommend rewriting the hoodie.properties file every time, since
this change is probably required very infrequently. A good approach would be
to add a flag that lets one override the payload class in the
hoodie.properties file when a user chooses to do so, and to document it.
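
A minimal sketch of the opt-in override described above, written in Python for
brevity rather than in Hudi's Java. The key name
hoodie.compaction.payload.class, the function resolve_payload_class, and the
allow_override parameter are assumptions made for illustration; they are not
taken from the Hudi code base or from the eventual fix.

```python
# Hypothetical sketch: none of these names are Hudi's actual API.
PAYLOAD_CLASS_KEY = "hoodie.compaction.payload.class"  # assumed key name


def resolve_payload_class(table_props, requested_class, allow_override=False):
    """Return a copy of the table properties with the payload class resolved.

    The stored class wins unless nothing is stored yet or the caller
    explicitly opts in to overriding it.
    """
    updated = dict(table_props)  # never mutate the caller's mapping
    stored = updated.get(PAYLOAD_CLASS_KEY)
    if requested_class and (stored is None or allow_override):
        updated[PAYLOAD_CLASS_KEY] = requested_class
    return updated


if __name__ == "__main__":
    props = {PAYLOAD_CLASS_KEY: "com.example.OldPayload"}
    print(resolve_payload_class(props, "com.example.NewPayload"))
    print(resolve_payload_class(props, "com.example.NewPayload", allow_override=True))
```

Keeping the stored value as the default means the file is only rewritten when
the user asks for it, which matches the expectation that this change is
needed very infrequently.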

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Pratyaksh Sharma
>            Priority: Minor


[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2019-11-08 Thread Pratyaksh Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969936#comment-16969936
 ] 

Pratyaksh Sharma commented on HUDI-114:
---------------------------------------

Hi [~nishith29], here is how I plan to do it:

When initialising DeltaSync, if the target base path already exists, I will
simply overwrite the payload class name in the hoodie.properties file with
the value passed by the user whenever the storage type is MERGE_ON_READ. Let
me know if this makes sense to you.

I went through the code: we store the payload class name in the
hoodie.properties file if the table type is MERGE_ON_READ. I have not yet
gone through the entire flow for MERGE_ON_READ tables. It would be great if
you could explain why we want to be able to overwrite the payload class; I
cannot fully relate to the idea right now. With your input, I will be able to
check whether I am missing any scenario.
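
The plan above could be sketched roughly as follows, in Python rather than
Java and against a plain key=value file. The file location
.hoodie/hoodie.properties under the base path, the key names
hoodie.table.type and hoodie.compaction.payload.class, and the helper names
are assumptions for illustration only, not the actual DeltaSync code.

```python
import os
import tempfile

# Hypothetical sketch: key names and file layout are assumptions.
TABLE_TYPE_KEY = "hoodie.table.type"
PAYLOAD_CLASS_KEY = "hoodie.compaction.payload.class"


def read_props(path):
    """Parse a minimal key=value properties file, skipping comments and blanks."""
    props = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
    return props


def write_props(path, props):
    """Write the properties back as sorted key=value lines."""
    with open(path, "w", encoding="utf-8") as fh:
        for key in sorted(props):
            fh.write(f"{key}={props[key]}\n")


def sync_payload_class(base_path, payload_class):
    """Overwrite the stored payload class for an existing MERGE_ON_READ table.

    Returns True only when the properties file was rewritten.
    """
    props_path = os.path.join(base_path, ".hoodie", "hoodie.properties")
    if not os.path.exists(props_path):
        return False  # table not initialised yet: nothing to overwrite
    props = read_props(props_path)
    if props.get(TABLE_TYPE_KEY) != "MERGE_ON_READ":
        return False  # payload class is only stored for MERGE_ON_READ
    if props.get(PAYLOAD_CLASS_KEY) == payload_class:
        return False  # already up to date, avoid a needless rewrite
    props[PAYLOAD_CLASS_KEY] = payload_class
    write_props(props_path, props)
    return True


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    os.makedirs(os.path.join(base, ".hoodie"))
    path = os.path.join(base, ".hoodie", "hoodie.properties")
    write_props(path, {TABLE_TYPE_KEY: "MERGE_ON_READ",
                       PAYLOAD_CLASS_KEY: "com.example.OldPayload"})
    print(sync_payload_class(base, "com.example.NewPayload"))
    print(read_props(path))
```

Note that this variant rewrites the file whenever the passed value differs
from the stored one, with no opt-in flag.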

 



[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2019-11-07 Thread Pratyaksh Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969206#comment-16969206
 ] 

Pratyaksh Sharma commented on HUDI-114:
---------------------------------------

[~nishith29] Yes, I would like some more clarification before I start working
on it. Specifically, I want more context on why one may need to pass a new
payload class name. It is already possible to configure it at run time using
the HoodieDeltaStreamer.Config class.

Also, which datasource API are you referring to?



[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2019-11-06 Thread Nishith Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968525#comment-16968525
 ] 

Nishith Agarwal commented on HUDI-114:
--------------------------------------

[~Pratyaksh] Glad to have you pick this up! Let me know if you need any
further clarification on this.

> Allow for clients to overwrite the payload implementation in hoodie.properties
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-114
>                 URL: https://issues.apache.org/jira/browse/HUDI-114
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: newbie
>            Reporter: Nishith Agarwal
>            Assignee: Nishith Agarwal
>            Priority: Minor


[jira] [Commented] (HUDI-114) Allow for clients to overwrite the payload implementation in hoodie.properties

2019-11-06 Thread Pratyaksh Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968225#comment-16968225
 ] 

Pratyaksh Sharma commented on HUDI-114:
---------------------------------------

[~nishith29] I would like to take it up. 
