[
https://issues.apache.org/jira/browse/SPARK-12521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071432#comment-15071432
]
Herman van Hovell commented on SPARK-12521:
-------------------------------------------
The {{lowerBound}} and {{upperBound}} parameters are used only for partitioning;
they are not supposed to be filters on your data. Also see:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala#L40-L47
A row that falls outside the {{lowerBound}} - {{upperBound}} interval is still
loaded; it ends up in the first or the last partition.
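
To make this concrete, here is a minimal, standalone sketch of how the per-partition WHERE clauses can be derived from the bounds. It is modeled on the logic in the file linked above but is not Spark's actual code; the class {{PartitionClauses}} and the method {{whereClauses}} are hypothetical names used for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: mimics how per-partition
// WHERE clauses are derived from lowerBound, upperBound and numPartitions.
final class PartitionClauses {

    static List<String> whereClauses(String column, long lowerBound,
                                     long upperBound, int numPartitions) {
        List<String> clauses = new ArrayList<>();
        if (numPartitions <= 1) {
            clauses.add(null); // single partition: no WHERE clause at all
            return clauses;
        }
        // Divide before subtracting to avoid overflow on large bounds.
        long stride = upperBound / numPartitions - lowerBound / numPartitions;
        long current = lowerBound;
        for (int i = 0; i < numPartitions; i++) {
            // The first partition has no lower limit ...
            String lower = (i != 0) ? column + " >= " + current : null;
            current += stride;
            // ... and the last partition has no upper limit.
            String upper = (i != numPartitions - 1) ? column + " < " + current : null;
            if (upper == null) {
                clauses.add(lower);
            } else if (lower == null) {
                clauses.add(upper);
            } else {
                clauses.add(lower + " AND " + upper);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Bounds and partition count taken from the issue description.
        for (String clause : whereClauses("ID", 2704225000L, 2704226000L, 10)) {
            System.out.println(clause);
        }
    }
}
```

Under these assumptions, the first clause has no lower limit and the last has no upper limit, so together the clauses cover every non-null value of ID and nothing is filtered out. To actually restrict the rows, the predicate would have to go into the {{dbtable}} subquery itself (for example a WHERE condition on ID) or into a filter on the resulting DataFrame.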
> DataFrame Partitions in java does not work
> ------------------------------------------
>
> Key: SPARK-12521
> URL: https://issues.apache.org/jira/browse/SPARK-12521
> Project: Spark
> Issue Type: Bug
> Components: Java API
> Affects Versions: 1.5.2
> Reporter: Sergey Podolsky
>
> Hello,
> Partitioning does not work in the Java interface of DataFrame:
> {code}
> SQLContext sqlContext = new SQLContext(sc);
> Map<String, String> options = new HashMap<>();
> options.put("driver", ORACLE_DRIVER);
> options.put("url", ORACLE_CONNECTION_URL);
> options.put("dbtable",
> "(SELECT * FROM JOBS WHERE ROWNUM < 10000) tt");
> options.put("lowerBound", "2704225000");
> options.put("upperBound", "2704226000");
> options.put("partitionColumn", "ID");
> options.put("numPartitions", "10");
> DataFrame jdbcDF = sqlContext.load("jdbc", options);
> List<Row> jobsRows = jdbcDF.collectAsList();
> System.out.println(jobsRows.size());
> {code}
> gives 9999 while 1000 was expected. Is it because the boundaries are big
> decimals, or does partitioning not work at all in Java?
> Thanks.
> Sergey