[ https://issues.apache.org/jira/browse/SPARK-7804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555680#comment-14555680 ]
Paul Wu commented on SPARK-7804:
--------------------------------

Unfortunately, JdbcRDD was poorly designed: its lowerBound and upperBound are long types, which is too limiting. One of my team members implemented a more general version based on the same idea, but some of my team are worried about relying on a home-made solution. When we saw JDBCRDD, it looked like what we wanted. In fact, I hope JDBCRDD can be made public, or JdbcRDD can be redesigned to handle the general case the way JDBCRDD does.

> Incorrect results from JDBCRDD -- one record repeatly
> -----------------------------------------------------
>
>                 Key: SPARK-7804
>                 URL: https://issues.apache.org/jira/browse/SPARK-7804
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0, 1.3.1
>            Reporter: Paul Wu
>
> Getting only one record repeated in the RDD and repeated field value:
>
> I have a table like:
> {code}
> attuid  name  email
> 12      john  j...@appp.com
> 23      tom   t...@appp.com
> 34      tony  t...@appp.com
> {code}
> My code:
> {code}
>     JavaSparkContext sc = new JavaSparkContext(sparkConf);
>     String url = "....";
>     java.util.Properties prop = new Properties();
>     List<JDBCPartition> partitionList = new ArrayList<>();
>     //int i;
>     partitionList.add(new JDBCPartition("1=1", 0));
>
>     List<StructField> fields = new ArrayList<StructField>();
>     fields.add(DataTypes.createStructField("attuid", DataTypes.StringType, true));
>     fields.add(DataTypes.createStructField("name", DataTypes.StringType, true));
>     fields.add(DataTypes.createStructField("email", DataTypes.StringType, true));
>     StructType schema = DataTypes.createStructType(fields);
>     JDBCRDD jdbcRDD = new JDBCRDD(sc.sc(),
>             JDBCRDD.getConnector("oracle.jdbc.OracleDriver", url, prop),
>             schema,
>             " USERS",
>             new String[]{"attuid", "name", "email"},
>             new Filter[]{ },
>             partitionList.toArray(new JDBCPartition[0])
>     );
>
>     System.out.println("count before to Java RDD=" + jdbcRDD.cache().count());
>     JavaRDD<Row> jrdd = jdbcRDD.toJavaRDD();
>     System.out.println("count=" + jrdd.count());
>     List<Row> lr = jrdd.collect();
>     for (Row r : lr) {
>         for (int ii = 0; ii < r.length(); ii++) {
>             System.out.println(r.getString(ii));
>         }
>     }
> {code}
> ===========================
> result is :
> {code}
> 34
> tony
> t...@appp.com
> 34
> tony
> t...@appp.com
> 34
> tony
> t...@appp.com
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
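
To illustrate the point in the comment about long-typed bounds being too limited, here is a minimal Java sketch contrasting the two partitioning styles. It is not Spark code: the class `PredicatePartitions` and its methods are hypothetical names invented for this example, and the way the numeric range is split is only one plausible scheme. The first method can only partition on a numeric column; the second builds arbitrary WHERE predicates (in the spirit of the `"1=1"` string passed to `JDBCPartition` in the quoted code), so it also works for a string column such as `name`.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Spark.
final class PredicatePartitions {

    // Splits the inclusive numeric range [lower, upper] into n contiguous
    // sub-ranges and returns one WHERE predicate per sub-range.
    // BigInteger avoids overflow when the range spans most of the long domain.
    static List<String> byLongRange(String column, long lower, long upper, int n) {
        BigInteger lo = BigInteger.valueOf(lower);
        BigInteger length = BigInteger.valueOf(upper).subtract(lo).add(BigInteger.ONE);
        BigInteger parts = BigInteger.valueOf(n);
        List<String> predicates = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            BigInteger start = lo.add(BigInteger.valueOf(i).multiply(length).divide(parts));
            BigInteger end = lo.add(BigInteger.valueOf(i + 1).multiply(length).divide(parts))
                    .subtract(BigInteger.ONE);
            predicates.add(column + " >= " + start + " AND " + column + " <= " + end);
        }
        return predicates;
    }

    // Builds predicates from explicit split points, so the column may be a
    // string. With no split points it degenerates to the single catch-all
    // predicate "1=1". Rows where the column is NULL match none of the
    // range predicates and would need a separate partition.
    static List<String> bySplitPoints(String column, List<String> splits) {
        List<String> predicates = new ArrayList<>();
        if (splits.isEmpty()) {
            predicates.add("1=1");
            return predicates;
        }
        predicates.add(column + " < " + quote(splits.get(0)));
        for (int i = 1; i < splits.size(); i++) {
            predicates.add(column + " >= " + quote(splits.get(i - 1))
                    + " AND " + column + " < " + quote(splits.get(i)));
        }
        predicates.add(column + " >= " + quote(splits.get(splits.size() - 1)));
        return predicates;
    }

    // Wraps a value in single quotes, doubling any embedded single quote.
    private static String quote(String s) {
        return "'" + s.replace("'", "''") + "'";
    }
}
```

For example, `PredicatePartitions.byLongRange("attuid", 12, 34, 2)` yields `attuid >= 12 AND attuid <= 22` and `attuid >= 23 AND attuid <= 34`, while `PredicatePartitions.bySplitPoints("name", Arrays.asList("k"))` yields `name < 'k'` and `name >= 'k'`. Each predicate string would then back one partition.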