Hi Jake,

There is another option among the third-party projects in the Spark
database ecosystem that have combined Spark with a DBMS in such a way that
the DataFrame API has been extended to include UPDATE operations
<http://snappydatainc.github.io/snappydata/programming_guide/#create-row-tables-using-api-update-the-contents-of-row-table>.
However, in your case you would have to move away from MySQL in order to
use this API.
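
For reference, the staging-table pattern discussed further down in this
thread can be sketched in plain Python. This is a minimal illustration, not
production code: the standard-library sqlite3 module stands in for MySQL,
and the table and column names are hypothetical. With MySQL you would write
the staging table from Spark via JDBC and then run the equivalent UPDATE on
the database side.

```python
import sqlite3

# In-memory SQLite database stands in for MySQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Target table with existing rows.
cur.execute("CREATE TABLE metrics (id INTEGER PRIMARY KEY, score REAL)")
cur.executemany("INSERT INTO metrics VALUES (?, ?)",
                [(1, 0.1), (2, 0.2), (3, 0.3)])

# Step 1: in the real workflow, Spark's DataFrameWriter would populate this
# staging table via JDBC using "overwrite" or "append" mode.
cur.execute("CREATE TABLE metrics_staging (id INTEGER PRIMARY KEY, score REAL)")
cur.executemany("INSERT INTO metrics_staging VALUES (?, ?)",
                [(2, 0.9), (3, 0.8)])

# Step 2: perform the UPDATE on the database side, inside one transaction,
# so it can be committed or rolled back as a unit.
cur.execute("""
    UPDATE metrics
    SET score = (SELECT s.score FROM metrics_staging s WHERE s.id = metrics.id)
    WHERE id IN (SELECT id FROM metrics_staging)
""")
conn.commit()

print(dict(cur.execute("SELECT id, score FROM metrics ORDER BY id")))
```

Note that SQLite requires the correlated-subquery form above; in MySQL the
same step is usually written with a join, e.g.
`UPDATE metrics m JOIN metrics_staging s ON m.id = s.id SET m.score = s.score`.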

Best,

Pierce

On Tue, Aug 22, 2017 at 7:54 AM, Jake Russ <jr...@bloomintelligence.com>
wrote:

> Hi Mich,
>
>
>
> Thank you for the explanation, that makes sense, and is helpful for me to
> understand the bigger picture between Spark/RDBMS.
>
>
>
> Happy to know I’m already following best practice.
>
>
>
> Cheers,
>
>
>
> Jake
>
>
>
> *From: *Mich Talebzadeh <mich.talebza...@gmail.com>
> *Date: *Monday, August 21, 2017 at 6:44 PM
> *To: *Jake Russ <jr...@bloomintelligence.com>
> *Cc: *"user@spark.apache.org" <user@spark.apache.org>
> *Subject: *Re: Update MySQL table via Spark/SparkR?
>
>
>
> Hi Jake,
>
> This is an issue across all RDBMSs, including Oracle. When you are
> updating, you have to commit or roll back in the RDBMS itself, and I am
> not aware of Spark doing that.
>
> The staging table is a safer method as it follows an ETL-type approach.
> You create the new data in the staging table in the RDBMS and do the DML
> in the RDBMS itself, where you can control commit or rollback. That is the
> way I would do it. A simple shell script can do both.
>
> HTH
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn
> *https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
>
> On 21 August 2017 at 15:50, Jake Russ <jr...@bloomintelligence.com> wrote:
>
> Hi everyone,
>
>
>
> I’m currently using SparkR to read data from a MySQL database, perform
> some calculations, and then write the results back to MySQL. Is it still
> true that Spark does not support UPDATE queries via JDBC? I’ve seen many
> posts stating that Spark’s DataFrameWriter does not support UPDATE
> queries via JDBC <https://issues.apache.org/jira/browse/SPARK-19335>;
> it will only “append” to or “overwrite” existing tables. The best advice
> I’ve found so far for performing this update is to write to a staging
> table in MySQL
> <https://stackoverflow.com/questions/34643200/spark-dataframes-upsert-to-postgres-table>
> and then perform the UPDATE query on the MySQL side.
>
>
>
> Ideally, I’d like to handle the update during the write operation. Has
> anyone else encountered this limitation and have a better solution?
>
>
>
> Thank you,
>
>
>
> Jake
>
>
>
