Re: Using a Database to persist and load data from

2014-10-31 Thread Kamal Banga
You can also use PairRDDFunctions' saveAsNewAPIHadoopFile, which takes an
OutputFormat class.
You will have to write a custom class that extends OutputFormat and
implements getRecordWriter, which returns a custom RecordWriter.
That means you will also have to write a custom class that extends
RecordWriter; its write method is what actually writes to the DB.
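
To make the RecordWriter part concrete, here is a rough sketch of what the write/close pair might look like over plain JDBC. It is only an illustration under several assumptions: SketchRecordWriter is a hypothetical stand-in so the code compiles without Hadoop (the real org.apache.hadoop.mapreduce.RecordWriter declares close(TaskAttemptContext), and its methods throw IOException and InterruptedException), DbRecordWriter is a made-up name, and the String key, Long value, and two-column insert are invented for the example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical stand-in for Hadoop's RecordWriter<K, V>, reduced to the two
// methods that matter here so the sketch compiles without Hadoop.
abstract class SketchRecordWriter<K, V> {
    abstract void write(K key, V value) throws SQLException;
    abstract void close() throws SQLException;
}

// Writes each (key, value) pair as one row of a two-column table.
final class DbRecordWriter extends SketchRecordWriter<String, Long> {
    private final Connection conn;
    private final PreparedStatement stmt;

    DbRecordWriter(Connection conn, String insertSql) throws SQLException {
        this.conn = conn;
        this.stmt = conn.prepareStatement(insertSql);
    }

    @Override
    void write(String key, Long value) throws SQLException {
        stmt.setString(1, key);   // JDBC parameter indexes start at 1
        stmt.setLong(2, value);
        stmt.addBatch();          // queue the row; nothing is sent yet
    }

    @Override
    void close() throws SQLException {
        try {
            stmt.executeBatch();  // send all queued rows in one batch
        } finally {
            try {
                stmt.close();
            } finally {
                conn.close();     // release the connection even if the flush failed
            }
        }
    }
}
```

In a real job the custom OutputFormat's getRecordWriter would open the connection and return a writer along these lines. Queuing every row until close keeps the sketch short; a long-running task would probably want to flush every few thousand rows instead.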


Re: Using a Database to persist and load data from

2014-10-31 Thread Sonal Goyal
I think you can try Hadoop's DBOutputFormat.

Best Regards,
Sonal
Nube Technologies http://www.nubetech.co

http://in.linkedin.com/in/sonalgoyal





Using a Database to persist and load data from

2014-10-30 Thread Asaf Lahav
Hi Ladies and Gents,
I would like to know what options I have for making Spark code I have
already written use a DB (Vertica) as its store/data source.
The data is tabular, so essentially any relational DB could be used.

Do I need to develop a context? If yes, how? Where can I get a good example?


Thank you,
Asaf


Re: Using a Database to persist and load data from

2014-10-30 Thread Yanbo Liang
AFAIK, you can read data from a DB with JdbcRDD, but there is no interface
for writing to a DB.
JdbcRDD has some restrictions; for example, the SQL must have a WHERE clause
(with two ? placeholders for the partition bounds).
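
The WHERE clause is needed because JdbcRDD binds a lower and an upper bound into two ? placeholders, once per partition, for example "SELECT * FROM t WHERE ? <= id AND id <= ?" (table and column names invented here). A rough sketch of how an inclusive key range could be divided among partitions follows; JdbcRangeSplit is a hypothetical helper written for illustration, not Spark's own code.

```java
import java.util.ArrayList;
import java.util.List;

final class JdbcRangeSplit {
    private JdbcRangeSplit() {}

    // Splits the inclusive range [lower, upper] into numPartitions contiguous
    // inclusive sub-ranges, returned as {start, end} pairs. Uses plain long
    // arithmetic, so extremely wide ranges could overflow.
    static List<long[]> split(long lower, long upper, int numPartitions) {
        List<long[]> ranges = new ArrayList<>();
        long length = 1 + upper - lower;  // both bounds are inclusive
        for (int i = 0; i < numPartitions; i++) {
            long start = lower + (i * length) / numPartitions;
            long end = lower + ((i + 1) * length) / numPartitions - 1;
            ranges.add(new long[] { start, end });
        }
        return ranges;
    }
}
```

With lower = 1, upper = 100 and three partitions this gives the ranges 1..33, 34..66 and 67..100, and each partition would run the query with its own pair of bounds.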
For writing to a DB, you can implement it yourself with mapPartitions or
foreachPartition.
You can refer to this example:
http://stackoverflow.com/questions/24916852/how-can-i-connect-to-a-postgresql-database-into-apache-spark-using-scala
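
As a rough idea of what the per-partition write could look like, the sketch below opens one connection for a whole partition and sends the rows in batches. It uses only plain JDBC and does not depend on Spark; PartitionWriter, the connection factory argument, the Object[] row shape, and the batch size are all assumptions made for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.Iterator;
import java.util.concurrent.Callable;

final class PartitionWriter {
    private PartitionWriter() {}

    // Writes all rows of one partition through a single connection, flushing
    // every batchSize rows. Returns the number of rows handed to the database.
    static int writePartition(Callable<Connection> connect, String insertSql,
                              Iterator<Object[]> rows, int batchSize) throws Exception {
        int written = 0;
        // Both resources are closed automatically, statement first.
        try (Connection conn = connect.call();
             PreparedStatement stmt = conn.prepareStatement(insertSql)) {
            int pending = 0;
            while (rows.hasNext()) {
                Object[] row = rows.next();
                for (int i = 0; i < row.length; i++) {
                    stmt.setObject(i + 1, row[i]);  // JDBC parameter indexes start at 1
                }
                stmt.addBatch();
                written++;
                if (++pending == batchSize) {
                    stmt.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                stmt.executeBatch();  // flush the final partial batch
            }
        }
        return written;
    }
}
```

Inside Spark this would be called from foreachPartition with the partition's iterator and a factory such as () -> DriverManager.getConnection(url, user, password), so the connection is created on the executor rather than serialized from the driver. foreachPartition is probably the better fit of the two, since mapPartitions is lazy and the write is purely a side effect.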
