[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2020-08-13 Thread ZhouDaHong (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177455#comment-17177455
 ] 

ZhouDaHong commented on SPARK-9182:
---

Hello, the problem seems to be that the "sal" field is of NUMERIC type, and 
during SQL processing a non-equality comparison against a numeric value cannot 
be matched. Try changing the "sal" field to int or double.

> filter and groupBy on DataFrames are not passed through to jdbc source
> --
>
> Key: SPARK-9182
> URL: https://issues.apache.org/jira/browse/SPARK-9182
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.1
>Reporter: Greg Rahn
>Assignee: Yijie Shen
>Priority: Critical
>
> When running all of these API calls, the only one that passes the filter 
> through to the backend jdbc source is equality.  All filters in these 
> commands should be able to be passed through to the jdbc database source.
> {code}
> val url="jdbc:postgresql:grahn"
> val prop = new java.util.Properties
> val emp = sqlContext.read.jdbc(url, "emp", prop)
> emp.filter(emp("sal") === 5000).show()
> emp.filter(emp("sal") < 5000).show()
> emp.filter("sal = 3000").show()
> emp.filter("sal > 2500").show()
> emp.filter("sal >= 2500").show()
> emp.filter("sal < 2500").show()
> emp.filter("sal <= 2500").show()
> emp.filter("sal != 3000").show()
> emp.filter("sal between 3000 and 5000").show()
> emp.filter("ename in ('SCOTT','BLAKE')").show()
> {code}
> We see from the PostgreSQL query log the following is run, and see that only 
> equality predicates are passed through.
> {code}
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp WHERE 
> sal = 5000
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp WHERE 
> sal = 3000
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
> LOG:  execute : SET extra_float_digits = 3
> LOG:  execute : SELECT 
> "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-12-03 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037670#comment-15037670
 ] 

Hyukjin Kwon commented on SPARK-9182:
-

Filed here https://issues.apache.org/jira/browse/SPARK-12126.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-11-30 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031526#comment-15031526
 ] 

Cheng Lian commented on SPARK-9182:
---

Converting all {{expressions.Filter}} to {{sources.Filter}} basically means 
that we need to write a new expression library, which might not be worth the 
effort.

This reminds me of the experimental {{CatalystScan}} trait, which was once 
used by {{ParquetRelation}} to handle partition pruning before 
{{HadoopFsRelation}} was added. Since JDBC is a built-in data source, maybe we 
can use a similar trick and pass Catalyst expressions to it directly, rather 
than {{sources.Filter}}, so that it can make smarter decisions.




[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-11-29 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031266#comment-15031266
 ] 

Hyukjin Kwon commented on SPARK-9182:
-

Just a thought: currently {{DataSourceStrategy.translateFilter}} only converts 
{{expressions.Filter}} to the {{sources.Filter}} types that are commonly 
convertible.

However, some data sources can process more filters than that, and some fewer. 
As far as I know, the {{unhandledFilters}} interface was added by 
https://issues.apache.org/jira/browse/SPARK-10978 so that a source can exclude 
some filters.

To include more filters, for example for JDBC and others such as 
Elasticsearch or Solr, would it be better to convert all 
{{expressions.Filter}} to {{sources.Filter}} to hide the internals, and then 
let {{unhandledFilters}} select the filters that the source can process?

This logic would also have to be added to Spark's internal data sources 
(namely, correcting the Parquet and ORC data sources to get rid of duplicated 
filters), but it would still be advantageous because it would remove 
Spark-side filtering (currently, the internal data sources filter data twice: 
once on the Spark side and once on the data source side).
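
A minimal sketch of the idea above, in plain Scala with no Spark dependency. The filter case classes and the {{unhandledFilters}} function here are simplified local stand-ins (assumptions for illustration, not Spark's actual {{sources.Filter}} classes), and the choice of which filters the hypothetical source can evaluate is arbitrary. It only shows how a source could report the filters it cannot handle, so that Spark would re-apply just those:

```scala
object UnhandledFiltersSketch {
  // Simplified local stand-ins for a few filter shapes (not Spark's classes).
  sealed trait Filter
  final case class EqualTo(attribute: String, value: Any) extends Filter
  final case class GreaterThan(attribute: String, value: Any) extends Filter
  final case class StringContains(attribute: String, value: String) extends Filter

  // Hypothetical source: assumed to evaluate comparisons remotely,
  // and assumed (for illustration only) not to handle StringContains.
  def unhandledFilters(filters: Seq[Filter]): Seq[Filter] =
    filters.filter {
      case _: EqualTo | _: GreaterThan => false // handled by the source
      case _                           => true  // Spark must re-apply these
    }

  def main(args: Array[String]): Unit = {
    val all = Seq(
      EqualTo("sal", 3000),
      GreaterThan("sal", 2500),
      StringContains("ename", "SCO"))
    val unhandled = unhandledFilters(all)
    val pushed = all.filterNot(unhandled.contains)
    println(s"pushed to source: $pushed")
    println(s"re-applied by Spark: $unhandled")
  }
}
```

With this split, only the filters returned as unhandled would need a second evaluation on the Spark side, which is the duplicated filtering the comment wants to avoid.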





[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-10-13 Thread Davies Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955646#comment-14955646
 ] 

Davies Liu commented on SPARK-9182:
---

For JDBC, I think we could push more expressions (for example, a + b > 3) into 
the remote database, including casts. This is more useful for JDBC than for 
file-based data sources, so we could spend more effort on it.
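
As a rough illustration of what pushing an expression such as a + b > 3 (or one involving a cast) could involve, here is a small plain-Scala sketch that renders an expression tree as a SQL predicate string. The {{Expr}} classes and {{toSql}} are hypothetical names invented for this example; they are not Catalyst's expression classes, and real SQL generation would also need dialect-specific quoting and type names:

```scala
object ExpressionPushdownSketch {
  // Hypothetical mini expression tree (not Catalyst's classes).
  sealed trait Expr
  final case class Column(name: String) extends Expr
  final case class IntLiteral(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr
  final case class Cast(child: Expr, sqlType: String) extends Expr
  final case class GreaterThan(left: Expr, right: Expr) extends Expr

  // Render the tree as a SQL fragment for a WHERE clause.
  def toSql(e: Expr): String = e match {
    case Column(n)         => "\"" + n + "\""
    case IntLiteral(v)     => v.toString
    case Add(l, r)         => s"(${toSql(l)} + ${toSql(r)})"
    case Cast(c, t)        => s"CAST(${toSql(c)} AS $t)"
    case GreaterThan(l, r) => s"${toSql(l)} > ${toSql(r)}"
  }

  def main(args: Array[String]): Unit = {
    // a + b > 3
    println(toSql(GreaterThan(Add(Column("a"), Column("b")), IntLiteral(3))))
    // a comparison whose column side is wrapped in a cast
    println(toSql(GreaterThan(Cast(Column("sal"), "DOUBLE PRECISION"), IntLiteral(2500))))
  }
}
```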




[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-09-11 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740577#comment-14740577
 ] 

Apache Spark commented on SPARK-9182:
-

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/8718




[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-12 Thread Yijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694727#comment-14694727
 ] 

Yijie Shen commented on SPARK-9182:
---

reverted it via https://github.com/apache/spark/pull/8157

> filter and groupBy on DataFrames are not passed through to jdbc source
> --
>
> Key: SPARK-9182
> URL: https://issues.apache.org/jira/browse/SPARK-9182
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.1
>Reporter: Greg Rahn
>Assignee: Yijie Shen
>Priority: Critical
> Fix For: 1.5.0





[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-12 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694739#comment-14694739
 ] 

Cheng Lian commented on SPARK-9182:
---

[~grahn] Unfortunately we found a regression in the previous fix and had to 
revert it. Until a proper fix is delivered, this issue can be worked around by 
explicitly casting the literal values in the filter. Namely, using
{noformat}
emp.filter("sal > cast(2500 as decimal(7, 2))")
{noformat}
instead of
{noformat}
emp.filter("sal > 2500")
{noformat}
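
A small sketch of how the cast workaround could be applied uniformly when several filters are needed. {{decimalFilter}} is a hypothetical helper written for this example (it is not part of Spark); it only builds the filter string with the literal explicitly cast, using the decimal(7, 2) type from the comment above. The comparison operator and the precision and scale are assumptions that should be matched to the actual query and column definition:

```scala
object DecimalFilterSketch {
  // Hypothetical helper (not a Spark API): builds a filter expression string
  // in which the literal is explicitly cast to decimal(precision, scale).
  def decimalFilter(column: String, op: String, value: BigDecimal,
                    precision: Int, scale: Int): String =
    s"$column $op cast($value as decimal($precision, $scale))"

  def main(args: Array[String]): Unit = {
    // The resulting string would be passed to DataFrame.filter,
    // e.g. emp.filter(decimalFilter("sal", ">", BigDecimal(2500), 7, 2))
    println(decimalFilter("sal", ">", BigDecimal(2500), 7, 2))
    println(decimalFilter("sal", "<=", BigDecimal(2500), 7, 2))
  }
}
```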


> filter and groupBy on DataFrames are not passed through to jdbc source
> --
>
> Key: SPARK-9182
> URL: https://issues.apache.org/jira/browse/SPARK-9182
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.1
>Reporter: Greg Rahn
>Assignee: Yijie Shen
>Priority: Critical
> Fix For: 1.5.0





[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-08 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662871#comment-14662871
 ] 

Apache Spark commented on SPARK-9182:
-

User 'yjshen' has created a pull request for this issue:
https://github.com/apache/spark/pull/8049

> filter and groupBy on DataFrames are not passed through to jdbc source
> --
>
> Key: SPARK-9182
> URL: https://issues.apache.org/jira/browse/SPARK-9182
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.1
>Reporter: Greg Rahn
>Assignee: Cheng Lian
>Priority: Critical




[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-06 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660278#comment-14660278
 ] 

Cheng Lian commented on SPARK-9182:
---

Hey [~grahn], sorry for the late reply, I somehow missed your last two comments.

Thanks for the detailed information. I'm able to reproduce this issue locally 
now. Confirmed that it's related to NUMERIC. Trying to deliver a fix for this.

> filter and groupBy on DataFrames are not passed through to jdbc source
> --
>
> Key: SPARK-9182
> URL: https://issues.apache.org/jira/browse/SPARK-9182
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.1
>Reporter: Greg Rahn

 When running all of these API calls, the only one that passes the filter 
 through to the backend jdbc source is equality.  All filters in these 
 commands should be able to be passed through to the jdbc database source.
 {code}
 val url = "jdbc:postgresql:grahn"
 val prop = new java.util.Properties
 val emp = sqlContext.read.jdbc(url, "emp", prop)
 emp.filter(emp("sal") === 5000).show()
 emp.filter(emp("sal") < 5000).show()
 emp.filter("sal = 3000").show()
 emp.filter("sal > 2500").show()
 emp.filter("sal >= 2500").show()
 emp.filter("sal < 2500").show()
 emp.filter("sal <= 2500").show()
 emp.filter("sal != 3000").show()
 emp.filter("sal between 3000 and 5000").show()
 emp.filter("ename in ('SCOTT','BLAKE')").show()
 {code}
 We see from the PostgreSQL query log the following is run, and see that only 
 equality predicates are passed through.
 {code}
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp WHERE 
 sal = 5000
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp WHERE 
 sal = 3000
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
 LOG:  execute <unnamed>: SET extra_float_digits = 3
 LOG:  execute <unnamed>: SELECT 
 "empno","ename","job","mgr","hiredate","sal","comm","deptno" FROM emp
 {code}
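
To make the expected behavior concrete, the sketch below models the statement one would hope to see in the PostgreSQL log for each filter in the report. It is a toy translator written for this illustration only: `Pred`, `Cmp`, `Between`, `In`, `whereClause`, and `pushedQuery` are hypothetical names, not Spark's data source API, and the generated SQL is the desired outcome rather than output captured from Spark.

```scala
// Toy model of predicate pushdown; all names are hypothetical and are
// NOT part of Spark's data source API.
object PushdownSketch {
  sealed trait Pred
  final case class Cmp(col: String, op: String, value: BigDecimal) extends Pred
  final case class Between(col: String, lo: BigDecimal, hi: BigDecimal) extends Pred
  final case class In(col: String, values: Seq[String]) extends Pred

  // Render one predicate as the SQL fragment that would follow WHERE.
  def whereClause(p: Pred): String = p match {
    case Cmp(c, op, v)      => s"$c $op $v"
    case Between(c, lo, hi) => s"$c >= $lo AND $c <= $hi"
    case In(c, vs) =>
      // Quote string values and double any embedded single quotes.
      vs.map(v => "'" + v.replace("'", "''") + "'").mkString(s"$c IN (", ", ", ")")
  }

  // Full statement a JDBC source could send if the predicate were pushed down.
  def pushedQuery(table: String, cols: Seq[String], p: Pred): String =
    s"SELECT ${cols.mkString(",")} FROM $table WHERE ${whereClause(p)}"
}
```

For example, `PushdownSketch.pushedQuery("emp", Seq("empno", "sal"), PushdownSketch.Cmp("sal", ">=", BigDecimal(2500)))` builds a `SELECT` with a `WHERE sal >= 2500` clause, in contrast to the unfiltered `SELECT ... FROM emp` statements in the log above.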






[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-07-23 Thread Greg Rahn (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639065#comment-14639065
 ] 

Greg Rahn commented on SPARK-9182:
--

Looks like it's related to NUMERIC data types from a quick test.







[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-07-23 Thread Greg Rahn (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639036#comment-14639036
 ] 

Greg Rahn commented on SPARK-9182:
--

{code}
grahn=# \d emp
              Table "public.emp"
  Column  |         Type          | Modifiers
----------+-----------------------+-----------
 empno    | numeric(4,0)          |
 ename    | character varying(10) |
 job      | character varying(9)  |
 mgr      | numeric(4,0)          |
 hiredate | date                  |
 sal      | numeric(7,2)          |
 comm     | numeric(7,2)          |
 deptno   | numeric(2,0)          |

grahn=# select * from emp;
 empno | ename  |    job    | mgr  |  hiredate  |   sal   |  comm   | deptno
-------+--------+-----------+------+------------+---------+---------+--------
  7369 | SMITH  | CLERK     | 7902 | 1980-12-17 |  800.00 |         |     20
  7499 | ALLEN  | SALESMAN  | 7698 | 1981-02-20 | 1600.00 |  300.00 |     30
  7521 | WARD   | SALESMAN  | 7698 | 1981-02-22 | 1250.00 |  500.00 |     30
  7566 | JONES  | MANAGER   | 7839 | 1981-04-02 | 2975.00 |         |     20
  7654 | MARTIN | SALESMAN  | 7698 | 1981-09-28 | 1250.00 | 1400.00 |     30
  7698 | BLAKE  | MANAGER   | 7839 | 1981-05-01 | 2850.00 |         |     30
  7782 | CLARK  | MANAGER   | 7839 | 1981-06-09 | 2450.00 |         |     10
  7788 | SCOTT  | ANALYST   | 7566 | 1982-12-09 | 3000.00 |         |     20
  7839 | KING   | PRESIDENT |      | 1981-11-17 | 5000.00 |         |     10
  7844 | TURNER | SALESMAN  | 7698 | 1981-09-08 | 1500.00 |    0.00 |     30
  7876 | ADAMS  | CLERK     | 7788 | 1983-01-12 | 1100.00 |         |     20
  7900 | JAMES  | CLERK     | 7698 | 1981-12-03 |  950.00 |         |     30
  7902 | FORD   | ANALYST   | 7566 | 1981-12-03 | 3000.00 |         |     20
  7934 | MILLER | CLERK     | 7782 | 1982-01-23 | 1300.00 |         |     10
(14 rows)

{code}
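
Another comment on this issue suggests changing `sal` to int or double. Without altering the table, one way to try that is to hand Spark a parenthesized subquery in place of the table name, casting the NUMERIC columns on the database side. This is an assumption added for illustration, not something proposed in this thread, and whether it restores pushdown on 1.4.1 is not verified here. `CastSubquery`, `castNumeric`, and the `emp_cast` alias are hypothetical names.

```scala
// Hypothetical helper: builds a subquery that casts the chosen NUMERIC columns
// to double precision, so the JDBC source reports them as floating-point columns.
object CastSubquery {
  def castNumeric(
      table: String,
      allCols: Seq[String],
      numericCols: Set[String],
      alias: String): String = {
    val projected = allCols.map { c =>
      if (numericCols.contains(c)) s"CAST($c AS double precision) AS $c" else c
    }
    s"(SELECT ${projected.mkString(", ")} FROM $table) AS $alias"
  }
}

// Usage in spark-shell, with `url` and `prop` as defined in the report:
//   val dbtable = CastSubquery.castNumeric(
//     "emp", Seq("empno", "ename", "sal"), Set("sal"), "emp_cast")
//   val emp2 = sqlContext.read.jdbc(url, dbtable, prop)
```

The cast changes the column type Spark sees, so results that depend on exact decimal arithmetic may differ from the original table.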







[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-07-23 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638427#comment-14638427
 ] 

Cheng Lian commented on SPARK-9182:
---

I suspect that the type of column {{sal}} is {{text}} rather than any numeric 
type. Reproduced this issue with the following setup.

Table DDL:
{code:sql}
create table t (a int, b real, c double precision, d text);
{code}
Test data:
{code:sql}
insert into t values (1, 1.1, 1.2, '1000');
insert into t values (2, 2.1, 2.2, '2000');
{code}
Spark shell snippet:
{code}
val url = "jdbc:postgresql:postgres"
val props = new java.util.Properties
val t = sqlContext.read.jdbc(url, "t", props)
t.filter('d > 1500).show()
{code}
Corresponding PostgreSQL log:
{noformat}
LOG:  execute <unnamed>: SET extra_float_digits = 3
LOG:  execute <unnamed>: SELECT "a","b","c","d" FROM t
{noformat}







[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-07-23 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638421#comment-14638421
 ] 

Cheng Lian commented on SPARK-9182:
---

[~grahn] Could you please provide the schema of the table? In particular, I'd 
like to know the data types of the columns involved.







[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-07-22 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636688#comment-14636688
 ] 

Cheng Lian commented on SPARK-9182:
---

I'm looking into this.







[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-07-22 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636686#comment-14636686
 ] 

Cheng Lian commented on SPARK-9182:
---

I'm looking into this.



