[jira] [Created] (SPARK-45437) Upgrade SNAPPY to 1.1.10.5 to pick up fix re Linux PLE64

2023-10-06 Thread N Campbell (Jira)
N Campbell created SPARK-45437:
--

 Summary: Upgrade SNAPPY to 1.1.10.5 to pick up fix re Linux PLE64
 Key: SPARK-45437
 URL: https://issues.apache.org/jira/browse/SPARK-45437
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.5.0
Reporter: N Campbell


SPARK-45323 moved to Snappy 1.1.10.4 and is proposed for inclusion in Spark 3.5.1.

Snappy releases prior to 1.1.10.5 will not work on Linux PLE64.

Moving to Snappy 1.1.10.5 will address that issue:
https://github.com/xerial/snappy-java/pull/515






[jira] [Commented] (SPARK-45323) Upgrade snappy to 1.1.10.4

2023-10-05 Thread N Campbell (Jira)


[ https://issues.apache.org/jira/browse/SPARK-45323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772424#comment-17772424 ]

N Campbell commented on SPARK-45323:


Is the same fix going to be backported into a 3.5.x release, given that Spark 
3.5 was released with Snappy 1.1.10.3?

> Upgrade snappy to 1.1.10.4
> --
>
> Key: SPARK-45323
> URL: https://issues.apache.org/jira/browse/SPARK-45323
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 4.0.0, 3.5.1
>Reporter: Bjørn Jørgensen
>Assignee: Bjørn Jørgensen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Security Fix
> Fixed SnappyInputStream so as not to allocate too much memory when 
> decompressing data with an extremely large chunk size, by @tunnelshade (code 
> change).
> This does not affect users who only use the Snappy.compress/uncompress methods.






[jira] [Reopened] (SPARK-20856) support statement using nested joins

2019-06-01 Thread N Campbell (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

N Campbell reopened SPARK-20856:


Per the prior comment: this is an enhancement request asking Spark SQL to 
provide better parity with the nested joined-table syntax that many systems 
support.

> support statement using nested joins
> 
>
> Key: SPARK-20856
> URL: https://issues.apache.org/jira/browse/SPARK-20856
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Priority: Major
>  Labels: bulk-closed
>
> While DB2, Oracle, etc. support a join expressed as follows, Spark SQL does
> not.
> Not supported:
> select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum
> versus the equivalent join written as:
> select * from 
>   cert.tsint tsint inner join cert.tint tint on tsint.rnum = tint.rnum inner 
> join cert.tbint tbint on tint.rnum = tbint.rnum
>
> ERROR_STATE, SQL state: org.apache.spark.sql.catalyst.parser.ParseException: 
> extraneous input 'on' expecting {<EOF>, ',', '.', '[', 'WHERE', 'GROUP', 
> 'ORDER', 'HAVING', 'LIMIT', 'OR', 'AND', 'IN', NOT, 'BETWEEN', 'LIKE', RLIKE, 
> 'IS', 'JOIN', 'CROSS', 'INNER', 'LEFT', 'RIGHT', 'FULL', 'NATURAL', 
> 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', EQ, '<=>', 
> '<>', '!=', '<', LTE, '>', GTE, '+', '-', '*', '/', '%', 'DIV', '&', '|', 
> '^', 'SORT', 'CLUSTER', 'DISTRIBUTE', 'ANTI'}(line 4, pos 5)
> == SQL ==
> select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum
> -^^^
> , Query: select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum.
> SQLState:  HY000
> ErrorCode: 500051






[jira] [Commented] (SPARK-20829) var_samp returns NaN while other vendors return a null value

2019-05-21 Thread N Campbell (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-20829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844875#comment-16844875 ]

N Campbell commented on SPARK-20829:


No reason was given. Presumably either:

A. Apache Spark does not want to align with ISO-SQL and will document the delta.
B. Apache Spark will add an option for those who want ISO-SQL behaviour.

> var_samp returns NaN while other vendors return a null value
> 
>
> Key: SPARK-20829
> URL: https://issues.apache.org/jira/browse/SPARK-20829
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Priority: Minor
>  Labels: bulk-closed
> Attachments: TSUPPLY
>
>
> SELECT
> sno AS SNO, 
> pno AS PNO, 
> VAR_SAMP(qty) AS C1
> FROM
> tsupply 
> GROUP BY 
> sno, 
> pno
> create table  if not exists TSUPPLY (RNUM int  , SNO string, PNO string, JNO 
> string, QTY int  )
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS textfile ;






[jira] [Commented] (SPARK-20856) support statement using nested joins

2019-05-21 Thread N Campbell (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844874#comment-16844874 ]

N Campbell commented on SPARK-20856:


This enhancement was bulk-closed as incomplete, with no reason given.

It is unclear whether the Apache Spark team is saying it has no intent of ever 
implementing the enhancement, or whether a bulk-close script simply clobbered 
things.

> support statement using nested joins
> 
>
> Key: SPARK-20856
> URL: https://issues.apache.org/jira/browse/SPARK-20856
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Priority: Major
>  Labels: bulk-closed
>
> While DB2, Oracle, etc. support a join expressed as follows, Spark SQL does
> not.
> Not supported:
> select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum
> versus the equivalent join written as:
> select * from 
>   cert.tsint tsint inner join cert.tint tint on tsint.rnum = tint.rnum inner 
> join cert.tbint tbint on tint.rnum = tbint.rnum
>
> ERROR_STATE, SQL state: org.apache.spark.sql.catalyst.parser.ParseException: 
> extraneous input 'on' expecting {<EOF>, ',', '.', '[', 'WHERE', 'GROUP', 
> 'ORDER', 'HAVING', 'LIMIT', 'OR', 'AND', 'IN', NOT, 'BETWEEN', 'LIKE', RLIKE, 
> 'IS', 'JOIN', 'CROSS', 'INNER', 'LEFT', 'RIGHT', 'FULL', 'NATURAL', 
> 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', EQ, '<=>', 
> '<>', '!=', '<', LTE, '>', GTE, '+', '-', '*', '/', '%', 'DIV', '&', '|', 
> '^', 'SORT', 'CLUSTER', 'DISTRIBUTE', 'ANTI'}(line 4, pos 5)
> == SQL ==
> select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum
> -^^^
> , Query: select * from 
>   cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
>  on tbint.rnum = tint.rnum
>  on tint.rnum = tsint.rnum.
> SQLState:  HY000
> ErrorCode: 500051






[jira] [Commented] (SPARK-20827) cannot express HAVING without a GROUP BY clause

2019-05-21 Thread N Campbell (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-20827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844872#comment-16844872 ]

N Campbell commented on SPARK-20827:


This enhancement was bulk-closed as incomplete, with no reason given.

It is unclear whether the Apache Spark team is saying it has no intent of ever 
implementing the enhancement, or whether a bulk-close script simply clobbered 
things.

> cannot express HAVING without a GROUP BY clause
> ---
>
> Key: SPARK-20827
> URL: https://issues.apache.org/jira/browse/SPARK-20827
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Priority: Minor
>  Labels: bulk-closed
>
> Spark SQL does not support a HAVING clause without a GROUP BY, which is valid 
> SQL and is supported by other engines (Oracle, DB2, etc.):
> SELECT
> '' AS `C1`
> FROM
> `cert`.`tparts`
>  HAVING 
> COUNT(`pno`) > 0
> SQL state: java.lang.UnsupportedOperationException: Cannot evaluate 
> expression: count(input[0, string, true]), Query: SELECT
> '' AS `C1`
> FROM
> `cert`.`tparts`
>  HAVING 
> COUNT(`pno`) > 0.
> SQLState:  HY000
> ErrorCode: 500051






[jira] [Created] (SPARK-20856) support statement using nested joins

2017-05-23 Thread N Campbell (JIRA)
N Campbell created SPARK-20856:
--

 Summary: support statement using nested joins
 Key: SPARK-20856
 URL: https://issues.apache.org/jira/browse/SPARK-20856
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.1.0
Reporter: N Campbell


While DB2, Oracle, etc. support a join expressed as follows, Spark SQL does not.

Not supported:
select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum

versus the equivalent join written as:
select * from 
  cert.tsint tsint inner join cert.tint tint on tsint.rnum = tint.rnum inner 
join cert.tbint tbint on tint.rnum = tbint.rnum
   


ERROR_STATE, SQL state: org.apache.spark.sql.catalyst.parser.ParseException: 
extraneous input 'on' expecting {<EOF>, ',', '.', '[', 'WHERE', 'GROUP', 
'ORDER', 'HAVING', 'LIMIT', 'OR', 'AND', 'IN', NOT, 'BETWEEN', 'LIKE', RLIKE, 
'IS', 'JOIN', 'CROSS', 'INNER', 'LEFT', 'RIGHT', 'FULL', 'NATURAL', 'LATERAL', 
'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', EQ, '<=>', '<>', '!=', '<', 
LTE, '>', GTE, '+', '-', '*', '/', '%', 'DIV', '&', '|', '^', 'SORT', 
'CLUSTER', 'DISTRIBUTE', 'ANTI'}(line 4, pos 5)

== SQL ==
select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum
-^^^
, Query: select * from 
  cert.tsint tsint inner join cert.tint tint inner join cert.tbint tbint
 on tbint.rnum = tint.rnum
 on tint.rnum = tsint.rnum.
SQLState:  HY000
ErrorCode: 500051








[jira] [Updated] (SPARK-20829) var_samp returns NaN while other vendors return a null value

2017-05-21 Thread N Campbell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

N Campbell updated SPARK-20829:
---
Attachment: TSUPPLY

> var_samp returns NaN while other vendors return a null value
> 
>
> Key: SPARK-20829
> URL: https://issues.apache.org/jira/browse/SPARK-20829
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Priority: Minor
> Attachments: TSUPPLY
>
>
> SELECT
> sno AS SNO, 
> pno AS PNO, 
> VAR_SAMP(qty) AS C1
> FROM
> tsupply 
> GROUP BY 
> sno, 
> pno
> create table  if not exists TSUPPLY (RNUM int  , SNO string, PNO string, JNO 
> string, QTY int  )
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS textfile ;






[jira] [Updated] (SPARK-20829) var_samp returns NaN while other vendors return a null value

2017-05-21 Thread N Campbell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

N Campbell updated SPARK-20829:
---
Summary: var_samp returns NaN while other vendors return a null value  
(was: var_sampe returns NaN while other vendors return a null value)

> var_samp returns NaN while other vendors return a null value
> 
>
> Key: SPARK-20829
> URL: https://issues.apache.org/jira/browse/SPARK-20829
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Priority: Minor
>
> SELECT
> sno AS SNO, 
> pno AS PNO, 
> VAR_SAMP(qty) AS C1
> FROM
> tsupply 
> GROUP BY 
> sno, 
> pno
> create table  if not exists TSUPPLY (RNUM int  , SNO string, PNO string, JNO 
> string, QTY int  )
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS textfile ;






[jira] [Created] (SPARK-20829) var_sampe returns NaN while other vendors return a null value

2017-05-21 Thread N Campbell (JIRA)
N Campbell created SPARK-20829:
--

 Summary: var_sampe returns NaN while other vendors return a null value
 Key: SPARK-20829
 URL: https://issues.apache.org/jira/browse/SPARK-20829
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.1.0
Reporter: N Campbell
Priority: Minor


SELECT
sno AS SNO, 
pno AS PNO, 
VAR_SAMP(qty) AS C1
FROM
tsupply 
GROUP BY 
sno, 
pno


create table  if not exists TSUPPLY (RNUM int  , SNO string, PNO string, JNO 
string, QTY int  )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
 STORED AS textfile ;
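
A workaround sketch (not part of the report; it assumes the nanvl function 
available in Spark SQL 1.5 and later): map the NaN back to NULL so the result 
matches other vendors.

-- nanvl returns its first argument unless it is NaN, in which case it
-- returns the second; here that converts var_samp's NaN to NULL
SELECT
sno AS SNO, 
pno AS PNO, 
nanvl( VAR_SAMP(qty), cast(null as double) ) AS C1
FROM
tsupply 
GROUP BY 
sno, 
pno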









[jira] [Created] (SPARK-20828) Concatenated grouping sets scenario not supported

2017-05-21 Thread N Campbell (JIRA)
N Campbell created SPARK-20828:
--

 Summary: Concatenated grouping sets scenario not supported 
 Key: SPARK-20828
 URL: https://issues.apache.org/jira/browse/SPARK-20828
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.1.0
Reporter: N Campbell


The following scenario, supported by other vendors (e.g. Oracle, DB2), is not 
supported by Spark SQL:

 WITH 
SQL1 AS 
(
SELECT
sno AS C1, 
pno AS C2, 
SUM(qty) AS C3
FROM
cert.tsupply 
GROUP BY 
ROLLUP(sno), 
CUBE(pno)
)
SELECT
SQL1.C1 AS C1, 
SQL1.C2 AS C2, 
SQL1.C3 AS C3
FROM
SQL1

Error: [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error 
Code: ERROR_STATE, SQL state: org.apache.spark.sql.AnalysisException: 
expression 'tsupply.`sno`' is neither present in the group by, nor is it an 
aggregate function. Add to group by or wrap in first() (or first_value) if you 
don't care which value you get.;;
'Project ['SQL1.C1 AS C1#1517671, 'SQL1.C2 AS C2#1517672, 'SQL1.C3 AS 
C3#1517673]
+- 'SubqueryAlias SQL1
   +- 'Aggregate [rollup(sno#1517678), cube(pno#1517679)], [sno#1517678 AS 
C1#1517674, pno#1517679 AS C2#1517675, sum(cast(qty#1517681 as bigint)) AS 
C3#1517676L]
  +- MetastoreRelation cert, tsupply
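
A workaround sketch (an expansion by hand, not a confirmed fix): ROLLUP(sno) 
is GROUPING SETS((sno), ()) and CUBE(pno) is GROUPING SETS((pno), ()), so 
their concatenation is the cross product of the two lists and can be written 
as explicit grouping sets.

-- equivalent explicit grouping sets: (sno, pno), (sno), (pno), ()
SELECT
sno AS C1, 
pno AS C2, 
SUM(qty) AS C3
FROM
cert.tsupply 
GROUP BY 
sno, 
pno 
GROUPING SETS ( (sno, pno), (sno), (pno), ( ) )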







[jira] [Created] (SPARK-20827) cannot express HAVING without a GROUP BY clause

2017-05-21 Thread N Campbell (JIRA)
N Campbell created SPARK-20827:
--

 Summary: cannot express HAVING without a GROUP BY clause
 Key: SPARK-20827
 URL: https://issues.apache.org/jira/browse/SPARK-20827
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.1.0
Reporter: N Campbell


Spark SQL does not support a HAVING clause without a GROUP BY, which is valid 
SQL and is supported by other engines (Oracle, DB2, etc.):

SELECT
'' AS `C1`
FROM
`cert`.`tparts`
 HAVING 
COUNT(`pno`) > 0

SQL state: java.lang.UnsupportedOperationException: Cannot evaluate expression: 
count(input[0, string, true]), Query: SELECT
'' AS `C1`
FROM
`cert`.`tparts`
 HAVING 
COUNT(`pno`) > 0.
SQLState:  HY000
ErrorCode: 500051
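
A workaround sketch (illustrative only): compute the aggregate in a derived 
table and filter it with WHERE instead of HAVING.

-- the whole table forms a single implicit group, so aggregate first and
-- then filter the one resulting row
SELECT '' AS `C1`
FROM ( SELECT COUNT(`pno`) AS cnt FROM `cert`.`tparts` ) t
WHERE t.cnt > 0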






[jira] [Commented] (SPARK-9686) Spark Thrift server doesn't return correct JDBC metadata

2017-05-01 Thread N Campbell (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990849#comment-15990849 ]

N Campbell commented on SPARK-9686:
---

Is this likely to be fixed?

The current behaviour forces companies to purchase commercial JDBC drivers as 
a workaround.
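
A workaround sketch (assuming a HiveQL statement is an acceptable substitute 
for the metadata call): fetch the table list by executing the statement below 
through a plain JDBC Statement rather than DatabaseMetaData.getTables().

-- lists the tables of the default database through the Thrift server
SHOW TABLES IN default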


> Spark Thrift server doesn't return correct JDBC metadata 
> -
>
> Key: SPARK-9686
> URL: https://issues.apache.org/jira/browse/SPARK-9686
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.5.2
>Reporter: pin_zhang
>Assignee: Cheng Lian
>Priority: Critical
> Attachments: SPARK-9686.1.patch.txt
>
>
> 1. Start start-thriftserver.sh
> 2. Connect with beeline
> 3. Create a table
> 4. Run show tables; the newly created table is returned
> 5. Fetch the table list through the JDBC metadata API:
>   Class.forName("org.apache.hive.jdbc.HiveDriver");
>   String URL = "jdbc:hive2://localhost:10000/default";
>   Properties info = new Properties();
>   Connection conn = DriverManager.getConnection(URL, info);
>   ResultSet tables = conn.getMetaData().getTables(conn.getCatalog(),
>       null, null, null);
> Problem:
>   No tables are returned by this API, though it worked in Spark 1.3






[jira] [Created] (SPARK-20545) union set operator should default to DISTINCT and not ALL semantics

2017-05-01 Thread N Campbell (JIRA)
N Campbell created SPARK-20545:
--

 Summary: union set operator should default to DISTINCT and not ALL 
semantics
 Key: SPARK-20545
 URL: https://issues.apache.org/jira/browse/SPARK-20545
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.1.0
Reporter: N Campbell


A set operation (i.e. UNION) over two queries that produce identical row values 
should return the distinct set of rows, not all rows.

ISO-SQL set-operation semantics default to DISTINCT; the Spark implementation 
defaults to ALL. While Spark allows the DISTINCT keyword, and some might assume 
ALL is faster, the semantically wrong result set is produced per the standard 
(and per commercial SQL systems including Oracle, DB2, Teradata, SQL Server, 
etc.).

select tsint.csint from cert.tsint 
union 
select tint.cint from cert.tint 

csint

-1
0
1
10

-1
0
1
10


vs

select tsint.csint from cert.tsint union distinct select tint.cint from 
cert.tint 

csint
-1

1
10
0







[jira] [Created] (SPARK-10777) order by fails when column is aliased and projection includes windowed aggregate

2015-09-23 Thread N Campbell (JIRA)
N Campbell created SPARK-10777:
--

 Summary: order by fails when column is aliased and projection 
includes windowed aggregate
 Key: SPARK-10777
 URL: https://issues.apache.org/jira/browse/SPARK-10777
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.5.0
Reporter: N Campbell


This statement fails in Spark (it works fine in Oracle, DB2):

select r as c1, min ( s ) over ()  as c2 from
( select rnum r, sum ( cint ) s from certstring.tint group by rnum ) t
order by r
Error: org.apache.spark.sql.AnalysisException: cannot resolve 'r' given input 
columns c1, c2; line 3 pos 9
SQLState:  null
ErrorCode: 0

Referencing the column alias in the ORDER BY works around the defect:

select r as c1, min ( s ) over ()  as c2 from
( select rnum r, sum ( cint ) s from certstring.tint group by rnum ) t
order by c1

This also works fine:

select r as c1, s  as c2 from
( select rnum r, sum ( cint ) s from certstring.tint group by rnum ) t
order by r


create table  if not exists TINT ( RNUM int , CINT int   )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
 STORED AS ORC  ;







[jira] [Created] (SPARK-10744) parser error (constant * column is null interpreted as constant * boolean)

2015-09-21 Thread N Campbell (JIRA)
N Campbell created SPARK-10744:
--

 Summary: parser error (constant * column is null interpreted as 
constant * boolean)
 Key: SPARK-10744
 URL: https://issues.apache.org/jira/browse/SPARK-10744
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.5.0
Reporter: N Campbell
Priority: Minor


Spark SQL inherits the same defect as Hive: the first statement below will not 
parse/execute, while the parenthesized form will. See HIVE-9530.

 select c1 from t1 where 1 * cnnull is null 
-vs- 
 select c1 from t1 where (1 * cnnull) is null 







[jira] [Created] (SPARK-10747) add support for window specification to include how NULLS are ordered

2015-09-21 Thread N Campbell (JIRA)
N Campbell created SPARK-10747:
--

 Summary: add support for window specification to include how NULLS 
are ordered
 Key: SPARK-10747
 URL: https://issues.apache.org/jira/browse/SPARK-10747
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.5.0
Reporter: N Campbell


You cannot express how NULLs are to be sorted in the window order specification 
and have to use a compensating expression to simulate it. This is the same 
limitation as Hive's, reported in Apache JIRA HIVE-9535.

This fails:

select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by c3 desc 
nulls last) from tolap

Error: org.apache.spark.sql.AnalysisException: line 1:76 missing ) at 'nulls' 
near 'nulls'
line 1:82 missing EOF at 'last' near 'nulls';
SQLState:  null

while this compensating expression executes:

select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by case when 
c3 is null then 1 else 0 end) from tolap
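
For a fuller simulation of c3 desc nulls last, a sketch (not part of the 
original report) keeps the descending sort key after the null flag:

-- non-null rows sort first (flag 0) and are then ordered by c3 descending;
-- null rows (flag 1) sort last, matching NULLS LAST semantics
select rnum, c1, c2, c3, dense_rank() over(partition by c1 order by case when 
c3 is null then 1 else 0 end, c3 desc) from tolap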






[jira] [Updated] (SPARK-10507) reject temporal expressions such as timestamp - timestamp at parse time

2015-09-09 Thread N Campbell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

N Campbell updated SPARK-10507:
---
Description: 
TIMESTAMP - TIMESTAMP in ISO-SQL should return an interval type which SPARK 
does not support.. 

A similar expression in Hive 0.13 fails with Error: Could not create ResultSet: 
Required field 'type' is unset! Struct:TPrimitiveTypeEntry(type:null) and SPARK 
has similar "challenges". While Hive 1.2.1 has added some interval type support 
it is far from complete with respect to ISO-SQL. 

The ability to compute the period of time (years, days, weeks, hours, ...) 
between timestamps or add/substract intervals from a timestamp are extremely 
common in business applications. 

Currently, a value expression such as select timestampcol - timestampcol from t 
will fail during execution and not parse time. While the error thrown states 
that fact, it would better for those value expressions to be rejected at parse 
time along with indicating the expression that is causing the parser error.


Operation: execute
Errors:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 6214.0 failed 4 times, most recent failure: Lost task 0.3 in stage 6214.0 
(TID 21208, sandbox.hortonworks.com): java.lang.RuntimeException: Type 
TimestampType does not support numeric operations
at scala.sys.package$.error(package.scala:27)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.numeric$lzycompute(arithmetic.scala:138)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.numeric(arithmetic.scala:136)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.eval(arithmetic.scala:150)
at 
org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:113)
at 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:68)
at 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:52)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at 
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at 
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at 
scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:813)
at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:813)
at 
org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)

create table if not exists TTS ( RNUM int , CTS timestamp )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
 STORED AS orc  ;
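
A workaround sketch (assuming the Hive unix_timestamp UDF is available and 
second-level precision suffices): convert each timestamp to epoch seconds so 
the subtraction becomes plain numeric arithmetic, which Spark supports.

-- elapsed seconds between two timestamp values; cts - cts itself is
-- rejected with the error above
select unix_timestamp(cts) - unix_timestamp(cts) as seconds_between from tts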


  was:
TIMESTAMP - TIMESTAMP in ISO-SQL is an interval type. Hive 0.13 fails with 
Error: Could not create ResultSet: Required field 'type' is unset! 
Struct:TPrimitiveTypeEntry(type:null) and SPARK has similar "challenges".

select cts - cts from tts 



Operation: execute
Errors:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 6214.0 failed 4 times, most recent failure: Lost task 0.3 in stage 6214.0 
(TID 21208, sandbox.hortonworks.com): java.lang.RuntimeException: Type 
TimestampType does not support numeric operations
at scala.sys.package$.error(package.scala:27)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.numeric$lzycompute(arithmetic.scala:138)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.numeric(arithmetic.scala:136)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.eval(arithmetic.scala:150)
at 
org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:113)
at 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:68)
at 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:52)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)

[jira] [Created] (SPARK-10502) tidy up the exception message text to be less verbose/"User friendly"

2015-09-08 Thread N Campbell (JIRA)
N Campbell created SPARK-10502:
--

 Summary: tidy up the exception message text to be less 
verbose/"User friendly"
 Key: SPARK-10502
 URL: https://issues.apache.org/jira/browse/SPARK-10502
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.1
Reporter: N Campbell
Priority: Minor


When a statement fails to parse, it would be preferable if the exception text 
were more aligned with other vendors' in indicating the syntax error, without 
including the verbose parse tree.


 select tbint.rnum,tbint.cbint, nth_value( tbint.cbint, '4' ) over ( order by 
tbint.rnum) from certstring.tbint 


Errors:
org.apache.spark.sql.AnalysisException: 
Unsupported language features in query: select tbint.rnum,tbint.cbint, 
nth_value( tbint.cbint, '4' ) over ( order by tbint.rnum) from certstring.tbint
TOK_QUERY 1, 0,40, 94
  TOK_FROM 1, 36,40, 94
TOK_TABREF 1, 38,40, 94
  TOK_TABNAME 1, 38,40, 94
certstring 1, 38,38, 94
tbint 1, 40,40, 105
  TOK_INSERT 0, -1,34, 0
TOK_DESTINATION 0, -1,-1, 0
  TOK_DIR 0, -1,-1, 0
TOK_TMP_FILE 0, -1,-1, 0
TOK_SELECT 1, 0,34, 12
  TOK_SELEXPR 1, 2,4, 12
. 1, 2,4, 12
  TOK_TABLE_OR_COL 1, 2,2, 7
tbint 1, 2,2, 7
  rnum 1, 4,4, 13
  TOK_SELEXPR 1, 6,8, 23
. 1, 6,8, 23
  TOK_TABLE_OR_COL 1, 6,6, 18
tbint 1, 6,6, 18
  cbint 1, 8,8, 24
  TOK_SELEXPR 1, 11,34, 31
TOK_FUNCTION 1, 11,34, 31
  nth_value 1, 11,11, 31
  . 1, 14,16, 47
TOK_TABLE_OR_COL 1, 14,14, 42
  tbint 1, 14,14, 42
cbint 1, 16,16, 48
  '4' 1, 19,19, 55
  TOK_WINDOWSPEC 1, 25,34, 82
TOK_PARTITIONINGSPEC 1, 27,33, 82
  TOK_ORDERBY 1, 27,33, 82
TOK_TABSORTCOLNAMEASC 1, 31,33, 82
  . 1, 31,33, 82
TOK_TABLE_OR_COL 1, 31,31, 77
  tbint 1, 31,31, 77
rnum 1, 33,33, 83

scala.NotImplementedError: No parse rules for ASTNode type: 882, text: 
TOK_WINDOWSPEC :
TOK_WINDOWSPEC 1, 25,34, 82
  TOK_PARTITIONINGSPEC 1, 27,33, 82
TOK_ORDERBY 1, 27,33, 82
  TOK_TABSORTCOLNAMEASC 1, 31,33, 82
. 1, 31,33, 82
  TOK_TABLE_OR_COL 1, 31,31, 77
tbint 1, 31,31, 77
  rnum 1, 33,33, 83
" +
 
org.apache.spark.sql.hive.HiveQl$.nodeToExpr(HiveQl.scala:1261)






[jira] [Created] (SPARK-10503) incorrect predicate evaluation involving NULL value

2015-09-08 Thread N Campbell (JIRA)
N Campbell created SPARK-10503:
--

 Summary: incorrect predicate evaluation involving NULL value
 Key: SPARK-10503
 URL: https://issues.apache.org/jira/browse/SPARK-10503
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: N Campbell


Query an ORC table in Hive using the following SQL statement via the Spark SQL 
thrift server. The row where rnum=0 has a cint value of NULL. The result set 
returned by Spark includes that row, projected as (0, 0), which is incorrect.

select tint.rnum, tint.rnum from tint where tint.cint in ( tint.cint )

The table in Hive:
create table if not exists TINT ( RNUM int , CINT smallint )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
 STORED AS orc  ;

data loaded into ORC table is

0|\N
1|-1
2|0
3|1
4|10
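
For reference, a sketch of the expected behaviour (an illustration, not part 
of the report): under ISO-SQL three-valued logic NULL IN (NULL) evaluates to 
UNKNOWN, so the rnum=0 row should be filtered out; an explicit null guard 
states that intent directly.

-- only rows 1 through 4 should come back; the guard makes the NULL
-- handling explicit rather than relying on IN's three-valued logic
select tint.rnum, tint.rnum from tint 
where tint.cint is not null and tint.cint in ( tint.cint )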







[jira] [Updated] (SPARK-10508) incorrect evaluation of searched case expression

2015-09-08 Thread N Campbell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

N Campbell updated SPARK-10508:
---
Summary: incorrect evaluation of searched case expression  (was: incorrect 
evaluation of search case expression)

> incorrect evaluation of searched case expression
> 
>
> Key: SPARK-10508
> URL: https://issues.apache.org/jira/browse/SPARK-10508
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.3.1
>Reporter: N Campbell
>
> The following case expression never evaluates to 'test1' when cdec is -1 or 
> 10, as it does in Hive 0.13. Instead it returns 'other' for all rows.
> select rnum, cdec, case when cdec in ( -1,10,0.1 )  then 'test1' else 'other' 
> end from tdec 
> create table if not exists TDEC ( RNUM int , CDEC decimal(7, 2 ))
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
>  STORED AS orc  ;
> 0|\N
> 1|-1.00
> 2|0.00
> 3|1.00
> 4|0.10
> 5|10.00






[jira] [Created] (SPARK-10508) incorrect evaluation of search case expression

2015-09-08 Thread N Campbell (JIRA)
N Campbell created SPARK-10508:
--

 Summary: incorrect evaluation of search case expression
 Key: SPARK-10508
 URL: https://issues.apache.org/jira/browse/SPARK-10508
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.1
Reporter: N Campbell


The following case expression never evaluates to 'test1' when cdec is -1 or 10, 
as it does in Hive 0.13. Instead it returns 'other' for all rows.

select rnum, cdec, case when cdec in ( -1,10,0.1 )  then 'test1' else 'other' 
end from tdec 

create table if not exists TDEC ( RNUM int , CDEC decimal(7, 2 ))
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
 STORED AS orc  ;


0|\N
1|-1.00
2|0.00
3|1.00
4|0.10
5|10.00
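
A workaround sketch (based on an assumption that the cause is a lossy implicit 
cast of the IN-list literals, which is not confirmed here): force each literal 
to the column's declared decimal type.

-- with the literals cast to decimal(7,2), the comparison happens in the
-- column's own type
select rnum, cdec, case when cdec in ( cast(-1 as decimal(7,2)), 
cast(10 as decimal(7,2)), cast(0.1 as decimal(7,2)) ) then 'test1' else 
'other' end from tdec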







[jira] [Created] (SPARK-10504) aggregate where NULL is defined as the value expression aborts when SUM used

2015-09-08 Thread N Campbell (JIRA)
N Campbell created SPARK-10504:
--

 Summary: aggregate where NULL is defined as the value expression 
aborts when SUM used
 Key: SPARK-10504
 URL: https://issues.apache.org/jira/browse/SPARK-10504
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.1
Reporter: N Campbell
Priority: Minor


In ISO-SQL the context would determine an implicit type for NULL, or a vendor 
might require an explicit type via CAST ( NULL as INTEGER ). Spark appears to 
presume a long type for expressions such as select min(NULL), max(NULL), but a 
query such as the following aborts.
 
select sum ( null  )  from tversion


Operation: execute
Errors:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 5232.0 failed 4 times, most recent failure: Lost task 0.3 in stage 5232.0 
(TID 18531, sandbox.hortonworks.com): scala.MatchError: NullType (of class 
org.apache.spark.sql.types.NullType$)
at 
org.apache.spark.sql.catalyst.expressions.Cast.org$apache$spark$sql$catalyst$expressions$Cast$$cast(Cast.scala:403)
at 
org.apache.spark.sql.catalyst.expressions.Cast.cast$lzycompute(Cast.scala:422)
at org.apache.spark.sql.catalyst.expressions.Cast.cast(Cast.scala:422)
at org.apache.spark.sql.catalyst.expressions.Cast.eval(Cast.scala:426)
at 
org.apache.spark.sql.catalyst.expressions.Coalesce.eval(nullFunctions.scala:51)
at 
org.apache.spark.sql.catalyst.expressions.Add.eval(arithmetic.scala:119)
at 
org.apache.spark.sql.catalyst.expressions.Coalesce.eval(nullFunctions.scala:51)
at 
org.apache.spark.sql.catalyst.expressions.MutableLiteral.update(literals.scala:82)
at 
org.apache.spark.sql.catalyst.expressions.SumFunction.update(aggregates.scala:581)
at 
org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$6.apply(Aggregate.scala:133)
at 
org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$6.apply(Aggregate.scala:126)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
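
A workaround sketch, following the explicit-typing approach the description 
mentions: give the NULL a concrete type so the aggregate no longer sees 
NullType.

-- cast(null as int) carries an integer type instead of NullType, so the
-- sum's internal Cast no longer hits the MatchError
select sum ( cast(null as int) ) from tversion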







[jira] [Created] (SPARK-10505) windowed form of count ( star ) fails with No handler for udf class

2015-09-08 Thread N Campbell (JIRA)
N Campbell created SPARK-10505:
--

 Summary: windowed form of count ( star ) fails with No handler for 
udf class
 Key: SPARK-10505
 URL: https://issues.apache.org/jira/browse/SPARK-10505
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.1
Reporter: N Campbell


The following statement will parse/execute in Hive 0.13 but fails in Spark.

Create a simple ORC table in Hive:
create table if not exists TOLAP ( RNUM int , C1 string, C2 string, C3 int, C4 int )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
 STORED AS orc ;

select rnum, c1, c2, c3, count(*) over(partition by c1) from tolap

Error: java.lang.RuntimeException: No handler for udf class 
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCount
SQLState:  null
ErrorCode: 0
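
A workaround sketch (an assumption, since it sidesteps GenericUDAFCount rather 
than fixing the missing handler): a windowed count of rows per partition can 
be expressed as a windowed sum of 1.

-- sum(1) per partition equals the partition's row count
select rnum, c1, c2, c3, sum(1) over(partition by c1) from tolap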






[jira] [Created] (SPARK-10507) timestamp - timestamp

2015-09-08 Thread N Campbell (JIRA)
N Campbell created SPARK-10507:
--

 Summary: timestamp - timestamp 
 Key: SPARK-10507
 URL: https://issues.apache.org/jira/browse/SPARK-10507
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.1
Reporter: N Campbell


TIMESTAMP - TIMESTAMP in ISO-SQL is an interval type. Hive 0.13 fails with 
Error: Could not create ResultSet: Required field 'type' is unset! 
Struct:TPrimitiveTypeEntry(type:null) and SPARK has similar "challenges".

select cts - cts from tts 



Operation: execute
Errors:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 6214.0 failed 4 times, most recent failure: Lost task 0.3 in stage 6214.0 
(TID 21208, sandbox.hortonworks.com): java.lang.RuntimeException: Type 
TimestampType does not support numeric operations
at scala.sys.package$.error(package.scala:27)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.numeric$lzycompute(arithmetic.scala:138)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.numeric(arithmetic.scala:136)
at 
org.apache.spark.sql.catalyst.expressions.Subtract.eval(arithmetic.scala:150)
at 
org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:113)
at 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:68)
at 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:52)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at 
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at 
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at 
scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:813)
at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:813)
at 
org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1498)

create table if not exists TTS ( RNUM int , CTS timestamp )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
 STORED AS orc  ;



