Chandan created SPARK-24226:
-------------------------------

             Summary: Reading data from Oracle 12c in Spark with numPartitions greater than 1 does not return the exact count
                 Key: SPARK-24226
                 URL: https://issues.apache.org/jira/browse/SPARK-24226
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: Chandan

Reading data from Oracle over JDBC using the Spark SQL context, as below:

    val query = s"""(select col1, col2, rownum from schematic.tablename) A"""
    val df = sparkcontextInstance.sqlcontext.read.format("jdbc")
      .option("url", urlstring)
      .option("dbtable", query)
      .option("user", username)
      .option("password", password)
      .option("numPartitions", 20)
      .option("partitionColumn", "rownum")
      .option("lowerBound", 1)
      .option("upperBound", 3000000)
      .option("fetchsize", 1500)
      .load()

df.count() returns only 150000, i.e. upperBound / numPartitions (3000000 / 20).

The table has 3 million records.
The table does not have any numeric column, so rownum was used as the partition column.
The code above is what returns this DataFrame count.
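One possible explanation for the count of 150000, offered as a sketch rather than a confirmed diagnosis: with numPartitions greater than 1, Spark splits the read into one query per partition and appends a range predicate on the partition column to each. The Scala below is a simplified model of that stride logic, not Spark's actual code. The object and function names (PartitionPredicates, partitionPredicates) are hypothetical, and the exact predicate text is an assumption based on the behaviour of Spark 2.2.

```scala
// Simplified, hypothetical model of how per-partition WHERE clauses could be
// derived from lowerBound, upperBound and numPartitions. Not Spark's real code.
object PartitionPredicates {

  def partitionPredicates(
      column: String,
      lowerBound: Long,
      upperBound: Long,
      numPartitions: Int): Seq[String] = {
    require(numPartitions >= 2, "this sketch only models the multi-partition case")
    require(lowerBound < upperBound, "lowerBound must be below upperBound")

    // Width of each partition's range (assumed formula, integer division).
    val stride: Long = upperBound / numPartitions - lowerBound / numPartitions

    (0 until numPartitions).map { i =>
      val start = lowerBound + i * stride
      val end = start + stride
      if (i == 0) s"$column < $end or $column is null"   // first partition: open below
      else if (i == numPartitions - 1) s"$column >= $start" // last partition: open above
      else s"$column >= $start AND $column < $end"
    }
  }

  def main(args: Array[String]): Unit = {
    // Values taken from the report: bounds 1 to 3000000, 20 partitions.
    val predicates = partitionPredicates("rownum", 1L, 3000000L, 20)
    predicates.take(3).foreach(println)
    println(predicates.last)
  }
}
```

If the generated predicates look like this, then in Oracle the unqualified rownum in the outer WHERE clause would presumably be evaluated as the ROWNUM pseudocolumn of the outer query, not as the value selected inside the subquery. A condition such as rownum >= 150001 can then never be satisfied, so only the first partition would return rows, which matches the observed 150000. Under that assumption, aliasing the pseudocolumn inside the subquery (for example rownum as rn, an illustrative name) and using the alias as partitionColumn might avoid the problem. This has not been verified against the table in this report.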