Did you check the query plan or the UI? That code looks sane to me. Maybe you've only configured one executor?
Gary

On Oct 24, 2017 2:55 PM, "Naveen Madhire" <vmadh...@umail.iu.edu> wrote:

> Hi,
>
> I am trying to fetch data from an Oracle DB using a subquery and am
> experiencing a lot of performance issues.
>
> Below is the query I am using,
>
> Using Spark 2.0.2
>
> val df = spark_session.read.format("jdbc")
>   .option("driver", "oracle.jdbc.OracleDriver")
>   .option("url", jdbc_url)
>   .option("user", user)
>   .option("password", pwd)
>   .option("dbtable", "subquery")
>   .option("partitionColumn", "id") // primary key column, uniformly distributed
>   .option("lowerBound", "1")
>   .option("upperBound", "500000")
>   .option("numPartitions", 30)
>   .load()
>
> The above query is configured to run with 30 partitions, but when I look
> at the UI it is only using 1 partition to run the query.
>
> Can anyone tell me if I am missing anything, or whether I need to do
> anything else to tune the performance of the query?
>
> Thanks
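
For reference, here is a rough sketch of how a partitioned JDBC read turns partitionColumn, lowerBound, upperBound and numPartitions into one range predicate per partition. It is a simplified illustration, not Spark's actual source; the object and method names (JdbcPartitionSketch, predicates) are made up for the example. The bounds only set the stride and do not filter rows, so if most ids fall outside 1..500000, nearly all rows would end up in the first or last partition.

```scala
// Simplified illustration of range-partitioned JDBC reads.
// Assumption: this mirrors the general idea only, not Spark's exact implementation.
object JdbcPartitionSketch {

  /** Builds one WHERE-clause predicate per partition for a numeric column. */
  def predicates(column: String, lower: Long, upper: Long, numPartitions: Int): Seq[String] = {
    require(numPartitions >= 1, "numPartitions must be positive")
    require(lower < upper, "lowerBound must be below upperBound")
    require(upper - lower >= numPartitions, "range too small for this many partitions")

    if (numPartitions == 1) {
      // A single partition reads everything, so no range filter is needed.
      Seq("1 = 1")
    } else {
      val stride = (upper - lower) / numPartitions
      (0 until numPartitions).map { i =>
        val start = lower + i * stride
        val end = start + stride
        if (i == 0) s"$column < $end OR $column IS NULL" // open-ended below, also picks up NULLs
        else if (i == numPartitions - 1) s"$column >= $start" // open-ended above
        else s"$column >= $start AND $column < $end"
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Same settings as in the quoted message.
    predicates("id", 1L, 500000L, 30).foreach(println)
  }
}
```

Running a count per predicate against the subquery is one way to see whether the rows are really spread evenly across the 30 ranges.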