Hi Iván,

The SparkJDBCOperator is an effort to replace Sqoop. For example, if you
run Spark on Kubernetes, you can use Spark to run your Sqoop-style
workloads as well. Please keep in mind that this operator is not as rich
in functionality as Sqoop. The original PR is here:
https://github.com/apache/airflow/pull/3021

The PySpark code is already available here:
https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/spark_jdbc_script.py
The operator passes all the arguments on to this script, so you won't
have to write any Spark code yourself.
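
To give you a feel for what the script does for the jdbc_to_spark
direction, it boils down to the read-over-JDBC / write-to-Hive pattern
below. This is only a simplified sketch: the URL, credentials and table
names are placeholders, and the real script takes them from your Airflow
connections and operator arguments.

from pyspark.sql import SparkSession

# Simplified sketch of what spark_jdbc_script.py does for jdbc_to_spark.
spark = (
    SparkSession.builder
    .appName("jdbc-to-spark-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Read the source table over JDBC (all option values are placeholders).
df = (
    spark.read
    .format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/mydb")
    .option("dbtable", "public.orders")
    .option("user", "reader")
    .option("password", "secret")
    .option("driver", "org.postgresql.Driver")
    .load()
)

# Write the result into a Hive metastore table.
df.write.mode("overwrite").saveAsTable("staging.orders")

spark.stop()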

You need to pass all the arguments to the operator; the PythonDoc is
self-explanatory:
https://github.com/apache/airflow/blob/master/airflow/contrib/operators/spark_jdbc_operator.py
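
Here is a minimal sketch of how you could wire the operator into a DAG.
The connection ids, table names and driver class below are placeholders
I made up; please check the docstring in spark_jdbc_operator.py for the
exact arguments and defaults in your Airflow version. You will also need
the JDBC driver jar available on the Spark classpath.

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.spark_jdbc_operator import SparkJDBCOperator

with DAG(
    dag_id="jdbc_to_hive_example",
    start_date=datetime(2019, 2, 1),
    schedule_interval="@daily",
) as dag:

    import_orders = SparkJDBCOperator(
        task_id="import_orders",
        cmd_type="jdbc_to_spark",             # direction of the transfer
        spark_conn_id="spark_default",        # Spark connection in Airflow
        jdbc_conn_id="postgres_orders",       # JDBC connection (placeholder)
        jdbc_table="public.orders",           # source table (placeholder)
        jdbc_driver="org.postgresql.Driver",  # driver class (placeholder)
        metastore_table="staging.orders",     # target Hive table (placeholder)
        save_mode="overwrite",
        num_executors=2,
        executor_memory="2g",
    )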

For further reference, this operator is also called the Sqark (SQL +
Spark) operator.

Hopefully, you're less lost now. If you have any further questions, let me
know.

Cheers, Fokko





On Fri, Feb 1, 2019 at 13:54 Iván Robla Albarrán <[email protected]> wrote:

> Hi,
>
> I am searching for a way to replace Apache Sqoop.
>
> I am looking at the SparkJDBCOperator, but I don't understand how I have
> to use it.
>
> Is it a version of the SparkSubmitOperator that includes a JDBC
> connection?
>
> Do I need to include Spark code?
>
> Any example?
>
> Thanks, I am very lost
>
> Regards,
> Iván Robla
>
