[jira] [Comment Edited] (SPARK-13141) Dataframe created from Hive partitioned tables using HiveContext returns wrong results

2016-03-01 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174895#comment-15174895
 ] 

zhichao-li edited comment on SPARK-13141 at 3/2/16 2:29 AM:


Just tried, but this cannot be reproduced on the master branch with:

create table mn.logs (field1 string, field2 string, field3 string)
partitioned by (year string, month string, day string, host string)
row format delimited fields terminated by ',';

insert into mn.logs partition (year="2013", month="07", day="28", host="host1")
values ("foo", "foo", "foo");

hc.table("mn.logs").show()


As you mentioned, not sure if it's specific to the CDH 5.5.1 build.


was (Author: zhichao-li):
Just tried, but this cannot be reproduced on the master branch with the SQL: 
`create table mn.logs (field1 string, field2 string, field3 string)
partitioned by (year string, month string, day string, host string)
row format delimited fields terminated by ',';` as you mentioned; not sure if 
it's specific to the CDH 5.5.1 build.

> Dataframe created from Hive partitioned tables using HiveContext returns 
> wrong results
> --
>
> Key: SPARK-13141
> URL: https://issues.apache.org/jira/browse/SPARK-13141
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.5.0
> Environment: CDH 5.5.1
>Reporter: Simone
>Priority: Critical
>
> I get wrong dataframe results using HiveContext with Spark 1.5.0 on CDH 5.5.1 
> in yarn-client mode.
> The problem occurs with partitioned tables on text-delimited HDFS data, with 
> both Scala and Python.
> This is example code:
> import org.apache.spark.sql.hive.HiveContext
> val hc = new HiveContext(sc)
> hc.table("my_db.partition_table").show()
> The result is that all values in all rows are NULL, except for the first 
> column (which contains the whole line of data) and the partitioning columns, 
> which appear to be correct.
> With Hive and Impala I get correct results.
> Spark also returns correct results on the same data with a non-partitioned 
> table.
> I think a similar problem also occurs with Avro data:
> https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Pyspark-Table-Dataframe-returning-empty-records-from-Partitioned/td-p/35836
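
As a hedged diagnostic sketch (the warehouse path below assumes Hive's default layout and is illustrative, not from the report), comparing the table read against the raw backing files helps confirm the data itself is intact:

{code:scala}
// Hedged sketch: the table read shows the reported NULLs, while reading the
// backing text files directly should show the original delimited rows.
import org.apache.spark.sql.hive.HiveContext

val hc = new HiveContext(sc)
hc.table("my_db.partition_table").show()  // reported: NULLs except first/partition cols
sc.textFile("/user/hive/warehouse/my_db.db/partition_table")
  .take(5).foreach(println)               // illustrative default warehouse path
{code}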






[jira] [Commented] (SPARK-13141) Dataframe created from Hive partitioned tables using HiveContext returns wrong results

2016-03-01 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174895#comment-15174895
 ] 

zhichao-li commented on SPARK-13141:


Just tried, but this cannot be reproduced on the master branch with the SQL: 
`create table mn.logs (field1 string, field2 string, field3 string)
partitioned by (year string, month string, day string, host string)
row format delimited fields terminated by ',';` as you mentioned; not sure if 
it's specific to the CDH 5.5.1 build.

> Dataframe created from Hive partitioned tables using HiveContext returns 
> wrong results
> --
>
> Key: SPARK-13141
> URL: https://issues.apache.org/jira/browse/SPARK-13141
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.5.0
> Environment: CDH 5.5.1
>Reporter: Simone
>Priority: Critical
>
> I get wrong dataframe results using HiveContext with Spark 1.5.0 on CDH 5.5.1 
> in yarn-client mode.
> The problem occurs with partitioned tables on text-delimited HDFS data, with 
> both Scala and Python.
> This is example code:
> import org.apache.spark.sql.hive.HiveContext
> val hc = new HiveContext(sc)
> hc.table("my_db.partition_table").show()
> The result is that all values in all rows are NULL, except for the first 
> column (which contains the whole line of data) and the partitioning columns, 
> which appear to be correct.
> With Hive and Impala I get correct results.
> Spark also returns correct results on the same data with a non-partitioned 
> table.
> I think a similar problem also occurs with Avro data:
> https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Pyspark-Table-Dataframe-returning-empty-records-from-Partitioned/td-p/35836






[jira] [Created] (SPARK-12820) Resolve column with full qualified names: db.table.column

2016-01-13 Thread zhichao-li (JIRA)
zhichao-li created SPARK-12820:
--

 Summary: Resolve column with full qualified names: db.table.column
 Key: SPARK-12820
 URL: https://issues.apache.org/jira/browse/SPARK-12820
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: zhichao-li
Priority: Minor


Currently Spark only supports specifying a column name like `table.col` or 
`col` in a projection, but it's very common for users to write `db.table.col`, 
especially when joining tables across databases.
Hive doesn't support this for now, but it is supported in many other 
traditional databases such as MySQL.
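
A short sketch of the desired syntax (database, table, and column names are hypothetical; at the time of filing this query would fail to resolve):

{code:scala}
// Hypothetical example: fully qualified db.table.column in a cross-database join.
sqlContext.sql("""
  SELECT db1.orders.id, db2.users.name
  FROM db1.orders JOIN db2.users
    ON db1.orders.user_id = db2.users.id
""")
{code}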







[jira] [Created] (SPARK-12789) Support order by index

2016-01-12 Thread zhichao-li (JIRA)
zhichao-li created SPARK-12789:
--

 Summary: Support order by index
 Key: SPARK-12789
 URL: https://issues.apache.org/jira/browse/SPARK-12789
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: zhichao-li
Priority: Minor


A number in ORDER BY is treated as a constant expression at the moment. It 
would be good to let users specify a column by its index, which has been 
supported in Hive 0.11.0 and later.
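
A minimal sketch of the proposed Hive-compatible behavior (the table and column names are illustrative):

{code:scala}
// Proposed: ORDER BY 2 sorts by the second projected column (value),
// rather than treating the literal 2 as a constant expression.
sqlContext.sql("SELECT key, value FROM src ORDER BY 2")
{code}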






[jira] [Created] (SPARK-11517) Calc partitions in parallel for multiple partitions table

2015-11-04 Thread zhichao-li (JIRA)
zhichao-li created SPARK-11517:
--

 Summary: Calc partitions in parallel for multiple partitions table
 Key: SPARK-11517
 URL: https://issues.apache.org/jira/browse/SPARK-11517
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Reporter: zhichao-li
Priority: Minor


Currently we calculate getPartitions for each Hive partition sequentially; it 
would be faster if we parallelized this on the driver side.
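
A minimal sketch of the idea using Scala parallel collections; `computeSplits` and the partition list are illustrative stand-ins, not Spark internals:

{code:scala}
// Hedged sketch: compute per-partition splits concurrently on the driver
// instead of one Hive partition at a time.
def computeSplits(partition: String): Seq[String] =
  Seq(s"$partition/part-00000")  // stand-in for the real file listing

val partitions = Seq("year=2013/month=07", "year=2013/month=08")
val splits = partitions.par.map(computeSplits).seq.flatten
{code}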







[jira] [Created] (SPARK-11121) Incorrect TaskLocation type

2015-10-14 Thread zhichao-li (JIRA)
zhichao-li created SPARK-11121:
--

 Summary: Incorrect TaskLocation type
 Key: SPARK-11121
 URL: https://issues.apache.org/jira/browse/SPARK-11121
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Reporter: zhichao-li
Priority: Minor


"toString" is the only difference between HostTaskLocation and 
HDFSCacheTaskLocation for the moment, but it would be better to correct this. 
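
A simplified illustration of the observation (not Spark's actual code; the "hdfs_cache_" prefix follows Spark's string encoding for cached locations):

{code:scala}
// Two location types that currently differ only in their string form.
sealed trait TaskLocation { def host: String }
case class HostTaskLocation(host: String) extends TaskLocation {
  override def toString: String = host
}
case class HDFSCacheTaskLocation(host: String) extends TaskLocation {
  override def toString: String = "hdfs_cache_" + host
}
{code}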






[jira] [Created] (SPARK-9626) Add python api for base64, crc32, pmod, factorial and conv functions

2015-08-04 Thread zhichao-li (JIRA)
zhichao-li created SPARK-9626:
-

 Summary: Add python api for base64, crc32, pmod, factorial and 
conv functions
 Key: SPARK-9626
 URL: https://issues.apache.org/jira/browse/SPARK-9626
 Project: Spark
  Issue Type: Improvement
  Components: PySpark
Reporter: zhichao-li
Priority: Minor









[jira] [Closed] (SPARK-9626) Add python api for base64, crc32, pmod, factorial and conv functions

2015-08-04 Thread zhichao-li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhichao-li closed SPARK-9626.
-
Resolution: Duplicate

Duplicate of SPARK-9513.

 Add python api for base64, crc32, pmod, factorial and conv functions
 

 Key: SPARK-9626
 URL: https://issues.apache.org/jira/browse/SPARK-9626
 Project: Spark
  Issue Type: Improvement
  Components: PySpark
Reporter: zhichao-li
Priority: Minor








[jira] [Created] (SPARK-9238) two extra useless entries for bytesOfCodePointInUTF8

2015-07-21 Thread zhichao-li (JIRA)
zhichao-li created SPARK-9238:
-

 Summary: two extra useless entries for bytesOfCodePointInUTF8
 Key: SPARK-9238
 URL: https://issues.apache.org/jira/browse/SPARK-9238
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: zhichao-li
Priority: Trivial


Only a trivial thing, and I'm not sure I understand correctly, but I guess 
only 2 entries in bytesOfCodePointInUTF8 are needed for the case of 6-byte 
code points (leading byte 1111110x).
Details can be found in the Description section of 
https://en.wikipedia.org/wiki/UTF-8.
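
For reference, a small sketch of the leading-byte ranges (per the Wikipedia table); bytes matching 1111110x are exactly 0xFC and 0xFD, hence only 2 entries:

{code:scala}
// UTF-8 sequence length from the leading byte (illustrative, not Spark's table).
def utf8Len(b: Int): Int =
  if ((b & 0x80) == 0x00) 1      // 0xxxxxxx
  else if ((b & 0xE0) == 0xC0) 2 // 110xxxxx
  else if ((b & 0xF0) == 0xE0) 3 // 1110xxxx
  else if ((b & 0xF8) == 0xF0) 4 // 11110xxx
  else if ((b & 0xFC) == 0xF8) 5 // 111110xx
  else if ((b & 0xFE) == 0xFC) 6 // 1111110x: only 0xFC and 0xFD
  else -1                        // continuation or invalid byte
{code}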






[jira] [Commented] (SPARK-8227) math function: unhex

2015-06-16 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587694#comment-14587694
 ] 

zhichao-li commented on SPARK-8227:
---

Typo; please ignore this one.

 math function: unhex
 

 Key: SPARK-8227
 URL: https://issues.apache.org/jira/browse/SPARK-8227
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: zhichao-li

 unhex(STRING a): BINARY
 Inverse of hex. Interprets each pair of characters as a hexadecimal number 
 and converts to the byte representation of the number.
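
A hedged sketch of the expected semantics in plain Scala (the Spark expression did not exist yet when this was filed):

{code:scala}
// unhex("4142") == Array(0x41, 0x42), i.e. the bytes of "AB".
def unhex(s: String): Array[Byte] =
  s.grouped(2).map(Integer.parseInt(_, 16).toByte).toArray
{code}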






[jira] [Commented] (SPARK-8206) math function: round

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579943#comment-14579943
 ] 

zhichao-li commented on SPARK-8206:
---

I will take this one

 math function: round
 

 Key: SPARK-8206
 URL: https://issues.apache.org/jira/browse/SPARK-8206
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 round(double a): double
 Returns the rounded BIGINT value of a.
 round(double a, INT d): double
 Returns a rounded to d decimal places.
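
A hedged sketch of the expected semantics in plain Scala:

{code:scala}
// round(2.5) == 3; round(3.14159, 2) == 3.14 (half-up, Hive-style).
def round(a: Double): Long = math.round(a)
def round(a: Double, d: Int): Double = {
  val factor = math.pow(10, d)
  math.round(a * factor) / factor
}
{code}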






[jira] [Commented] (SPARK-8220) math function: positive

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579961#comment-14579961
 ] 

zhichao-li commented on SPARK-8220:
---

I will take this one

 math function: positive
 ---

 Key: SPARK-8220
 URL: https://issues.apache.org/jira/browse/SPARK-8220
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 positive(INT a): INT
 positive(DOUBLE a): DOUBLE
 This is really just an identity function. We should create an Identity 
 expression, and then have the optimizer simply remove the Identity functions.






[jira] [Commented] (SPARK-8221) math function: pmod

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579962#comment-14579962
 ] 

zhichao-li commented on SPARK-8221:
---

I will take this one

 math function: pmod
 ---

 Key: SPARK-8221
 URL: https://issues.apache.org/jira/browse/SPARK-8221
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 pmod(INT a, INT b): INT
 pmod(DOUBLE a, DOUBLE b): DOUBLE
 Returns the positive value of a mod b.
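
A hedged sketch of the expected semantics; unlike %, the result always has the sign of b:

{code:scala}
// pmod(-7, 3) == 2, whereas -7 % 3 == -1.
def pmod(a: Int, b: Int): Int = ((a % b) + b) % b
{code}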






[jira] [Commented] (SPARK-8219) math function: negative

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579959#comment-14579959
 ] 

zhichao-li commented on SPARK-8219:
---

I will take this one

 math function: negative
 ---

 Key: SPARK-8219
 URL: https://issues.apache.org/jira/browse/SPARK-8219
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Reynold Xin

 This is just an alias for UnaryMinus. Only add it to FunctionRegistry, and 
 not DataFrame.






[jira] [Commented] (SPARK-8222) math function: alias power / pow

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579963#comment-14579963
 ] 

zhichao-li commented on SPARK-8222:
---

I will take this one

 math function: alias power / pow
 

 Key: SPARK-8222
 URL: https://issues.apache.org/jira/browse/SPARK-8222
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Reynold Xin

 Add power to FunctionRegistry.






[jira] [Commented] (SPARK-8209) math function: conv

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579945#comment-14579945
 ] 

zhichao-li commented on SPARK-8209:
---

I will take this one

 math function: conv
 ---

 Key: SPARK-8209
 URL: https://issues.apache.org/jira/browse/SPARK-8209
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 conv(BIGINT num, INT from_base, INT to_base): string
 conv(STRING num, INT from_base, INT to_base): string
 Converts a number from a given base to another (see 
 http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html#function_conv).
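
A hedged sketch of the expected semantics using the JDK's radix support:

{code:scala}
// conv("ff", 16, 10) == "255"; conv("255", 10, 16) == "ff".
def conv(num: String, fromBase: Int, toBase: Int): String =
  java.lang.Long.toString(java.lang.Long.parseLong(num, fromBase), toBase)
{code}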






[jira] [Commented] (SPARK-8208) math function: ceiling

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579944#comment-14579944
 ] 

zhichao-li commented on SPARK-8208:
---

I will take this one

 math function: ceiling
 --

 Key: SPARK-8208
 URL: https://issues.apache.org/jira/browse/SPARK-8208
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 We already have ceil -- just need to create an alias for it in 
 FunctionRegistry.






[jira] [Commented] (SPARK-8211) math function: radians

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579947#comment-14579947
 ] 

zhichao-li commented on SPARK-8211:
---

I will take this one

 math function: radians
 --

 Key: SPARK-8211
 URL: https://issues.apache.org/jira/browse/SPARK-8211
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Reynold Xin

 Alias toRadians -> radians in FunctionRegistry.






[jira] [Commented] (SPARK-8213) math function: factorial

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579948#comment-14579948
 ] 

zhichao-li commented on SPARK-8213:
---

I will take this one

 math function: factorial
 

 Key: SPARK-8213
 URL: https://issues.apache.org/jira/browse/SPARK-8213
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 factorial(INT a): long
 Returns the factorial of a (as of Hive 1.2.0). Valid a is [0..20].
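
A hedged sketch of the expected semantics; 20! is the largest factorial that fits in a long:

{code:scala}
// factorial(5) == 120; valid input is [0..20].
def factorial(a: Int): Long = {
  require(a >= 0 && a <= 20, "valid a is [0..20]")
  (1 to a).foldLeft(1L)(_ * _)
}
{code}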






[jira] [Commented] (SPARK-8214) math function: hex

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579949#comment-14579949
 ] 

zhichao-li commented on SPARK-8214:
---

I will take this one

 math function: hex
 --

 Key: SPARK-8214
 URL: https://issues.apache.org/jira/browse/SPARK-8214
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 hex(BIGINT a): string
 hex(STRING a): string
 hex(BINARY a): string
 If the argument is an INT or binary, hex returns the number as a STRING in 
 hexadecimal format. Otherwise if the number is a STRING, it converts each 
 character into its hexadecimal representation and returns the resulting 
 STRING. (See 
 http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_hex, 
 BINARY version as of Hive 0.12.0.)
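
A hedged sketch of the expected semantics in plain Scala:

{code:scala}
// hex(255L) == "ff"; hex("AB") == "4142" (each character's hex representation).
def hex(a: Long): String = java.lang.Long.toHexString(a)
def hex(s: String): String =
  s.getBytes("UTF-8").map(b => "%02x".format(b & 0xff)).mkString
{code}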






[jira] [Commented] (SPARK-8210) math function: degrees

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579946#comment-14579946
 ] 

zhichao-li commented on SPARK-8210:
---

I will take this one

 math function: degrees
 --

 Key: SPARK-8210
 URL: https://issues.apache.org/jira/browse/SPARK-8210
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Reynold Xin

 Alias toDegrees -> degrees.






[jira] [Commented] (SPARK-8216) math function: rename log -> ln

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579952#comment-14579952
 ] 

zhichao-li commented on SPARK-8216:
---

I will take this one

 math function: rename log -> ln
 ---

 Key: SPARK-8216
 URL: https://issues.apache.org/jira/browse/SPARK-8216
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Assignee: Reynold Xin

 Rename expression Log -> Ln.
 Also create aliased data frame functions, and update FunctionRegistry.






[jira] [Commented] (SPARK-8224) math function: shiftright

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579965#comment-14579965
 ] 

zhichao-li commented on SPARK-8224:
---

I will take this one

 math function: shiftright
 -

 Key: SPARK-8224
 URL: https://issues.apache.org/jira/browse/SPARK-8224
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 shiftright(INT a), shiftright(BIGINT a)
 Bitwise right shift (as of Hive 1.2.0). Returns int for tinyint, 
 smallint and int a. Returns bigint for bigint a.






[jira] [Commented] (SPARK-8223) math function: shiftleft

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579964#comment-14579964
 ] 

zhichao-li commented on SPARK-8223:
---

I will take this one

 math function: shiftleft
 

 Key: SPARK-8223
 URL: https://issues.apache.org/jira/browse/SPARK-8223
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 shiftleft(INT a)
 shiftleft(BIGINT a)
 Bitwise left shift (as of Hive 1.2.0). Returns int for tinyint, smallint and 
 int a. Returns bigint for bigint a.
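
A hedged sketch of the three shift variants together; note that Hive's actual functions take a second argument for the shift amount, which the quoted signatures omit:

{code:scala}
// shiftleft(2, 3) == 16; shiftright(16, 3) == 2; shiftrightunsigned(-1, 28) == 15.
def shiftleft(a: Int, b: Int): Int = a << b
def shiftright(a: Int, b: Int): Int = a >> b
def shiftrightunsigned(a: Int, b: Int): Int = a >>> b
{code}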






[jira] [Commented] (SPARK-8227) math function: unhex

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579969#comment-14579969
 ] 

zhichao-li commented on SPARK-8227:
---

I will take this one

 math function: unhex
 

 Key: SPARK-8227
 URL: https://issues.apache.org/jira/browse/SPARK-8227
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 unhex(STRING a): BINARY
 Inverse of hex. Interprets each pair of characters as a hexadecimal number 
 and converts to the byte representation of the number.






[jira] [Commented] (SPARK-8226) math function: shiftrightunsigned

2015-06-09 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579968#comment-14579968
 ] 

zhichao-li commented on SPARK-8226:
---

I will take this one

 math function: shiftrightunsigned
 -

 Key: SPARK-8226
 URL: https://issues.apache.org/jira/browse/SPARK-8226
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin

 shiftrightunsigned(INT a), shiftrightunsigned(BIGINT a)   
 Bitwise unsigned right shift (as of Hive 1.2.0). Returns int for tinyint, 
 smallint and int a. Returns bigint for bigint a.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Issue Comment Deleted] (SPARK-7119) ScriptTransform doesn't consider the output data type

2015-06-04 Thread zhichao-li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhichao-li updated SPARK-7119:
--
Comment: was deleted

(was: This workaround query can be executed correctly and there's a simple fix 
for this issue by the way :))

 ScriptTransform doesn't consider the output data type
 -

 Key: SPARK-7119
 URL: https://issues.apache.org/jira/browse/SPARK-7119
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0, 1.3.1, 1.4.0
Reporter: Cheng Hao

 {code:sql}
 from (from src select transform(key, value) using 'cat' as (thing1 int, 
 thing2 string)) t select thing1 + 2;
 {code}
 {noformat}
 15/04/24 00:58:55 ERROR CliDriver: org.apache.spark.SparkException: Job 
 aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent 
 failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): 
 java.lang.ClassCastException: org.apache.spark.sql.types.UTF8String cannot be 
 cast to java.lang.Integer
   at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
   at scala.math.Numeric$IntIsIntegral$.plus(Numeric.scala:57)
   at 
 org.apache.spark.sql.catalyst.expressions.Add.eval(arithmetic.scala:127)
   at 
 org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:118)
   at 
 org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:68)
   at 
 org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:52)
   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
   at 
 scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
   at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
   at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
   at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
   at scala.collection.AbstractIterator.to(Iterator.scala:1157)
   at 
 scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
   at 
 scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
   at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
   at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:819)
   at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:819)
   at 
 org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1618)
   at 
 org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1618)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
   at org.apache.spark.scheduler.Task.run(Task.scala:64)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:209)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 {noformat}
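
The workaround query referenced in the comments below was not preserved by the archive; a hypothetical query of the kind that sidesteps the ClassCastException declares the transform output as string and casts explicitly:

{code:scala}
// Hypothetical workaround sketch, NOT the original poster's query (which was
// lost): the script output really is a string, so declare it as such and cast.
sql("""from (from src select transform(key, value) using 'cat'
       as (thing1 string, thing2 string)) t
       select cast(thing1 as int) + 2""")
{code}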






[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type

2015-06-04 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573889#comment-14573889
 ] 

zhichao-li commented on SPARK-7119:
---

This workaround query can be executed correctly and there's a simple fix for 
this issue by the way :)

 ScriptTransform doesn't consider the output data type
 -

 Key: SPARK-7119
 URL: https://issues.apache.org/jira/browse/SPARK-7119
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0, 1.3.1, 1.4.0
Reporter: Cheng Hao

 {code:sql}
 from (from src select transform(key, value) using 'cat' as (thing1 int, 
 thing2 string)) t select thing1 + 2;
 {code}
 {noformat}
 15/04/24 00:58:55 ERROR CliDriver: org.apache.spark.SparkException: Job 
 aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent 
 failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): 
 java.lang.ClassCastException: org.apache.spark.sql.types.UTF8String cannot be 
 cast to java.lang.Integer
   at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
   at scala.math.Numeric$IntIsIntegral$.plus(Numeric.scala:57)
   at 
 org.apache.spark.sql.catalyst.expressions.Add.eval(arithmetic.scala:127)
   at 
 org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:118)
   at 
 org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:68)
   at 
 org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:52)
   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
   at 
 scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
   at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
   at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
   at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
   at scala.collection.AbstractIterator.to(Iterator.scala:1157)
   at 
 scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
   at 
 scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
   at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
   at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:819)
   at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:819)
   at 
 org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1618)
   at 
 org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1618)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
   at org.apache.spark.scheduler.Task.run(Task.scala:64)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:209)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 {noformat}






[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type

2015-06-04 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573888#comment-14573888
 ] 

zhichao-li commented on SPARK-7119:
---

This workaround query can be executed correctly and there's a simple fix for 
this issue by the way :)

 ScriptTransform doesn't consider the output data type
 -

 Key: SPARK-7119
 URL: https://issues.apache.org/jira/browse/SPARK-7119
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0, 1.3.1, 1.4.0
Reporter: Cheng Hao

 {code:sql}
 from (from src select transform(key, value) using 'cat' as (thing1 int, 
 thing2 string)) t select thing1 + 2;
 {code}
 {noformat}
 15/04/24 00:58:55 ERROR CliDriver: org.apache.spark.SparkException: Job 
 aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent 
 failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): 
 java.lang.ClassCastException: org.apache.spark.sql.types.UTF8String cannot be 
 cast to java.lang.Integer
   at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
   at scala.math.Numeric$IntIsIntegral$.plus(Numeric.scala:57)
   at 
 org.apache.spark.sql.catalyst.expressions.Add.eval(arithmetic.scala:127)
   at 
 org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:118)
   at 
 org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:68)
   at 
 org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:52)
   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
   at 
 scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
   at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
   at 
 scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
   at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
   at scala.collection.AbstractIterator.to(Iterator.scala:1157)
   at 
 scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
   at 
 scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
   at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
   at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:819)
   at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:819)
   at 
 org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1618)
   at 
 org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1618)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
   at org.apache.spark.scheduler.Task.run(Task.scala:64)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:209)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 {noformat}






[jira] [Updated] (SPARK-7862) Query would hang when the using script has error output in SparkSQL

2015-05-26 Thread zhichao-li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhichao-li updated SPARK-7862:
--
Description: 
Steps to reproduce:

val data = (1 to 10).map { i => (i, i, i) }
data.toDF("d1", "d2", "d3").registerTempTable("script_trans")
 sql("SELECT TRANSFORM (d1, d2, d3) USING 'cat 1>&2' AS (a,b,c) FROM 
script_trans")
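
This is the classic pattern where a child process blocks once its unread stderr fills the OS pipe buffer. A minimal sketch of the usual remedy, draining stderr on a separate thread (illustrative, not Spark's actual patch):

{code:scala}
// Drain the child's stderr so a chatty script cannot block on a full pipe.
import java.io.{BufferedReader, InputStreamReader}

val proc = new ProcessBuilder("cat").start()
val drainer = new Thread(new Runnable {
  def run(): Unit = {
    val err = new BufferedReader(new InputStreamReader(proc.getErrorStream))
    var line = err.readLine()
    while (line != null) { System.err.println(line); line = err.readLine() }
  }
})
drainer.setDaemon(true)
drainer.start()
{code}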

 Query would hang when the using script has error output in SparkSQL
 ---

 Key: SPARK-7862
 URL: https://issues.apache.org/jira/browse/SPARK-7862
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: zhichao-li

 Steps to reproduce:
 val data = (1 to 10).map { i => (i, i, i) }
 data.toDF("d1", "d2", "d3").registerTempTable("script_trans")
  sql("SELECT TRANSFORM (d1, d2, d3) USING 'cat 1>&2' AS (a,b,c) FROM 
 script_trans")






[jira] [Created] (SPARK-7862) Query would hang when the using script has error output in SparkSQL

2015-05-26 Thread zhichao-li (JIRA)
zhichao-li created SPARK-7862:
-

 Summary: Query would hang when the using script has error output 
in SparkSQL
 Key: SPARK-7862
 URL: https://issues.apache.org/jira/browse/SPARK-7862
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: zhichao-li









[jira] [Closed] (SPARK-6897) Remove volatile from BlockingGenerator.currentBuffer

2015-04-14 Thread zhichao-li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhichao-li closed SPARK-6897.
-
Resolution: Won't Fix

There may not be much benefit in removing the volatile.

 Remove volatile from BlockingGenerator.currentBuffer
 

 Key: SPARK-6897
 URL: https://issues.apache.org/jira/browse/SPARK-6897
 Project: Spark
  Issue Type: Improvement
  Components: Streaming
Reporter: zhichao-li
Priority: Trivial

 It would introduce extra performance overhead if we use both volatile and 
 synchronized to guard the same resource (BlockingGenerator.currentBuffer). 
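
A simplified illustration of the doubled-up pattern (not the actual BlockGenerator code):

{code:scala}
// `buffer` is volatile AND every access already happens inside synchronized,
// so the volatile write barrier adds cost without adding safety.
class Generator {
  @volatile private var buffer = Vector.empty[Int]
  def add(x: Int): Unit = synchronized { buffer = buffer :+ x }
  def swap(): Vector[Int] = synchronized {
    val old = buffer
    buffer = Vector.empty
    old
  }
}
{code}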






[jira] [Created] (SPARK-6762) Fix potential resource leaks

2015-04-07 Thread zhichao-li (JIRA)
zhichao-li created SPARK-6762:
-

 Summary: Fix potential resource leaks
 Key: SPARK-6762
 URL: https://issues.apache.org/jira/browse/SPARK-6762
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Reporter: zhichao-li
Priority: Minor









[jira] [Updated] (SPARK-6762) Fix potential resource leaks in CheckPoint CheckpointWriter and CheckpointReader

2015-04-07 Thread zhichao-li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhichao-li updated SPARK-6762:
--
Summary: Fix potential resource leaks in CheckPoint CheckpointWriter and 
CheckpointReader  (was: Fix potential resource leaks)

 Fix potential resource leaks in CheckPoint CheckpointWriter and 
 CheckpointReader
 

 Key: SPARK-6762
 URL: https://issues.apache.org/jira/browse/SPARK-6762
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Reporter: zhichao-li
Priority: Minor








[jira] [Updated] (SPARK-6762) Fix potential resource leaks in CheckPoint CheckpointWriter and CheckpointReader

2015-04-07 Thread zhichao-li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhichao-li updated SPARK-6762:
--
Description: The close action should be placed within a finally block to 
avoid potential resource leaks
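
A minimal sketch of the pattern the description calls for (the path and read logic are illustrative):

{code:scala}
// close() sits in finally so the stream is released even when reading throws.
import java.io.FileInputStream

def readCheckpoint(path: String): Array[Byte] = {
  val in = new FileInputStream(path)
  try {
    val buf = new Array[Byte](in.available())
    in.read(buf)
    buf
  } finally {
    in.close()
  }
}
{code}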

 Fix potential resource leaks in CheckPoint CheckpointWriter and 
 CheckpointReader
 

 Key: SPARK-6762
 URL: https://issues.apache.org/jira/browse/SPARK-6762
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Reporter: zhichao-li
Priority: Minor

 The close action should be placed within a finally block to avoid potential 
 resource leaks






[jira] [Commented] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-04-01 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390221#comment-14390221
 ] 

zhichao-li commented on SPARK-6613:
---

[~msoutier], have you found any solution for this, or did you just report the bug?

 Starting stream from checkpoint causes Streaming tab to throw error
 ---

 Key: SPARK-6613
 URL: https://issues.apache.org/jira/browse/SPARK-6613
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 When continuing my streaming job from a checkpoint, the job runs, but the 
 Streaming tab in the standard UI initially no longer works (browser just 
 shows HTTP ERROR: 500). Sometimes  it gets back to normal after a while, and 
 sometimes it stays in this state permanently.
 Stacktrace:
 WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
 java.util.NoSuchElementException: key not found: 0
   at scala.collection.MapLike$class.default(MapLike.scala:228)
   at scala.collection.AbstractMap.default(Map.scala:58)
   at scala.collection.MapLike$class.apply(MapLike.scala:141)
   at scala.collection.AbstractMap.apply(Map.scala:58)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at scala.collection.immutable.Range.foreach(Range.scala:141)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
   at scala.Option.map(Option.scala:145)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
   at 
 org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
   at 
 org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:370)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:745)

[jira] [Commented] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-04-01 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392102#comment-14392102
 ] 

zhichao-li commented on SPARK-6613:
---

Just trying to understand the issue, but it can't be reproduced on my side. If 
possible, could you elaborate on how to reproduce it, i.e. a code snippet or 
steps?

 Starting stream from checkpoint causes Streaming tab to throw error
 ---

 Key: SPARK-6613
 URL: https://issues.apache.org/jira/browse/SPARK-6613
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 When continuing my streaming job from a checkpoint, the job runs, but the 
 Streaming tab in the standard UI initially no longer works (browser just 
 shows HTTP ERROR: 500). Sometimes  it gets back to normal after a while, and 
 sometimes it stays in this state permanently.
 Stacktrace:
 WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
 java.util.NoSuchElementException: key not found: 0
   at scala.collection.MapLike$class.default(MapLike.scala:228)
   at scala.collection.AbstractMap.default(Map.scala:58)
   at scala.collection.MapLike$class.apply(MapLike.scala:141)
   at scala.collection.AbstractMap.apply(Map.scala:58)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at scala.collection.immutable.Range.foreach(Range.scala:141)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
   at scala.Option.map(Option.scala:145)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
   at 
 org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
   at 
 org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:370)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
  org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:745)

[jira] [Commented] (SPARK-6077) Multiple spark streaming tabs on UI when reuse the same sparkcontext

2015-03-01 Thread zhichao-li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342595#comment-14342595
 ] 

zhichao-li commented on SPARK-6077:
---

Yeah, it would fix SPARK-2463 as well. It's almost the same case, although 
most of the comments on that JIRA are about stopping concurrently running 
StreamingContexts in the same JVM.

 Multiple spark streaming tabs on UI when reuse the same sparkcontext
 

 Key: SPARK-6077
 URL: https://issues.apache.org/jira/browse/SPARK-6077
 Project: Spark
  Issue Type: Bug
  Components: Streaming, Web UI
Reporter: zhichao-li
Priority: Minor

 Currently we create a new streaming tab for each StreamingContext even if 
 there's already one on the same SparkContext, which causes duplicate 
 StreamingTabs to be created, none of which takes effect. 
 snapshot: 
 https://www.dropbox.com/s/t4gd6hqyqo0nivz/bad%20multiple%20streamings.png?dl=0
 How to reproduce:
 1)
 import org.apache.spark.SparkConf
 import org.apache.spark.streaming.{Seconds, StreamingContext}
 import org.apache.spark.storage.StorageLevel
 val ssc = new StreamingContext(sc, Seconds(1))
 val lines = ssc.socketTextStream("localhost", , 
 StorageLevel.MEMORY_AND_DISK_SER)
 val words = lines.flatMap(_.split(" "))
 val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
 wordCounts.print()
 ssc.start()
 .
 2)
 ssc.stop(false)
 val ssc = new StreamingContext(sc, Seconds(1))
 val lines = ssc.socketTextStream("localhost", , 
 StorageLevel.MEMORY_AND_DISK_SER)
 val words = lines.flatMap(_.split(" "))
 val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
 wordCounts.print()
 ssc.start()






[jira] [Created] (SPARK-6077) Multiple spark streaming tabs on UI when reuse the same sparkcontext

2015-02-27 Thread zhichao-li (JIRA)
zhichao-li created SPARK-6077:
-

 Summary: Multiple spark streaming tabs on UI when reuse the same 
sparkcontext
 Key: SPARK-6077
 URL: https://issues.apache.org/jira/browse/SPARK-6077
 Project: Spark
  Issue Type: Bug
  Components: Streaming, Web UI
Reporter: zhichao-li
Priority: Minor


Currently we create a new streaming tab for each StreamingContext even if 
there's already one on the same SparkContext, which causes duplicate 
StreamingTabs to be created, none of which takes effect. 

snapshot: 
https://www.dropbox.com/s/t4gd6hqyqo0nivz/bad%20multiple%20streamings.png?dl=0

How to reproduce:
1)
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.storage.StorageLevel

val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", , 
StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.print()
ssc.start()
.

2)
ssc.stop(false)
val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", , 
StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.print()
ssc.start()


