cdmikechen opened a new issue #1144: Is it possible to support SQL-like method?
URL: https://github.com/apache/incubator-hudi/issues/1144
 
 
   As we know, Hudi uses the Spark datasource API to upsert data. For example, 
to update a record we first have to read the old row, and then use the upsert 
method to write the updated row back. 
   But there is another situation, where someone just wants to update one column 
of the data. Described in SQL, it is `update table set col1 = X where 
col2 = Y`. Hudi cannot handle this directly at present: we can only load all 
the affected rows as a dataset first, modify them, and then merge them back.
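
   To make the current workaround concrete, here is a rough sketch of the 
read, modify, upsert cycle with the Spark datasource. The field names `id`, 
`ts` and `partition`, the table name `my_table`, the `/*/*` load glob and the 
helper `setCol1Where` are all assumptions for illustration, not taken from any 
real table; `col1` and `col2` are assumed to be string columns.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object UpdateOneColumn {
  // Hypothetical helper: keeps only the rows where col2 = y and sets col1 = x on them.
  def setCol1Where(df: DataFrame, x: String, y: String): DataFrame =
    df.filter(col("col2") === y).withColumn("col1", lit(x))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("hudi-update-sketch").getOrCreate()
    val basePath = args(0)

    // 1. Read the current data. The glob depends on the partition depth (assumed: one level).
    val current = spark.read.format("org.apache.hudi").load(basePath + "/*/*")

    // 2. Select and modify the affected rows, dropping Hudi's own metadata columns.
    val metaCols = current.columns.filter(_.startsWith("_hoodie_"))
    val updated = setCol1Where(current, "X", "Y").drop(metaCols: _*)

    // 3. Upsert the modified rows back. Key/precombine/partition fields are assumed names.
    updated.write.format("org.apache.hudi")
      .option("hoodie.datasource.write.operation", "upsert")
      .option("hoodie.datasource.write.recordkey.field", "id")
      .option("hoodie.datasource.write.precombine.field", "ts")
      .option("hoodie.datasource.write.partitionpath.field", "partition")
      .option("hoodie.table.name", "my_table")
      .mode(SaveMode.Append)
      .save(basePath)

    spark.stop()
  }
}
```

   Every full row that matches has to be read and rewritten even though only 
`col1` changes, which is the overhead this issue is about.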
   So I think maybe we could create a new subproject that processes batch data 
in a SQL-like way. For example:
   ```
   val hudiTable = new HudiTable(path)
   hudiTable.update.set("col1 = X").where("col2 = Y")
   hudiTable.delete.where("col3 = Z")
   hudiTable.commit
   ```
   It could also extend the functionality to support JDBC-like schemes such as the one in this RFC:  
https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller
   
   I hope everyone can offer some suggestions on whether this plan is feasible.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
