Thanks, I got that I can handle a pool on my own when dealing with foreachPartition, etc. My question is mainly about what happens in the following scenario.
.....
val df: DataFrame = hiveSqlContext.read.format("jdbc").options(options).load()
df.registerTempTable("V_RELATIONS")
.....

I can register df as a temp table for later use in my app. What happens to the underlying connections used when accessing that DataFrame? Are they all closed once a 'timeout' is reached, or are they kept open for later use? What I'm trying to understand is what happens if I access that table later, for example via the Thrift server or from my code. If the connections have been closed every time the DataFrame is accessed, reopening them on each access is a performance penalty, and for Oracle in particular opening connections is also costly...

Does anyone have a better understanding of this?

Thanks again!
Marco

2016-03-25 13:42 GMT+01:00 manasdebashiskar <poorinsp...@gmail.com>:
> Yes, there is.
> You can use the default DBCP or your own preferred connection pool manager.
> Then when you ask for a connection, you get one from the pool.
>
> Take a look at this:
> https://github.com/manasdebashiskar/kafka-exactly-once
> It is forked from Cody's repo.
>
> ..Manas
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-and-DB-connection-pool-tp26577p26596.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

--
Ing. Marco Colombo
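For reference, a minimal sketch of the behavior being asked about (assuming Spark 1.x and the same `hiveSqlContext` and `options` map from the snippet above): the JDBC relation is lazy, Spark opens one connection per partition for each action and closes it when the task finishes, so nothing stays open or pooled between queries. Caching the DataFrame is the usual way to avoid paying the reconnect cost on every later access; the `StorageLevel` choice here is an assumption, not something from the thread.

```scala
import org.apache.spark.storage.StorageLevel

// Same setup as in the question (Spark 1.x API):
val df = hiveSqlContext.read.format("jdbc").options(options).load()
df.registerTempTable("V_RELATIONS")

// registerTempTable only registers the logical plan. Without caching,
// every query against the temp table (from this app or via the Thrift
// server) goes back to the JDBC source and opens fresh connections:
hiveSqlContext.sql("SELECT * FROM V_RELATIONS").show()

// Persisting materializes the rows in Spark's storage, so later
// accesses read the cached blocks instead of reopening connections
// to Oracle. MEMORY_AND_DISK is an illustrative choice:
df.persist(StorageLevel.MEMORY_AND_DISK)
df.count()  // forces the one-time read through JDBC
```

After the `count()`, subsequent queries on `V_RELATIONS` are served from the cache until the DataFrame is unpersisted or evicted; evicted partitions would be re-read through JDBC again.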