[jira] [Updated] (SPARK-20879) Spark SQL will read Date type column from avro file as Int

2017-05-24 Thread bing huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bing huang updated SPARK-20879:
---
Attachment: 00_0.avro

avro file with 5 columns

> Spark SQL will read Date type column from avro file as Int
> --
>
> Key: SPARK-20879
> URL: https://issues.apache.org/jira/browse/SPARK-20879
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 
> 2.1.1
>Reporter: bing huang
> Attachments: 00_0.avro
>
>
> I was using the following code and the attached avro file.
> The avro file has 5 columns named (date1, date2, date3, start_date, end_date)
> with types (string, string, date, string, string) respectively.
> You can verify the column types in Hive with either "select * from 00_0.avro"
> or "describe 00_0.avro".
> The Spark SQL code is below; as you can see, the third column (date3) comes
> back as int instead of date.
> // assumes the spark-avro package (com.databricks:spark-avro) is on the classpath
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.SQLContext
> import com.databricks.spark.avro._
> val conf = new SparkConf().setAppName("test").setMaster("local[3]")
> val sc = new SparkContext(conf)
> val sqlContext = new SQLContext(sc)
> val df = sqlContext.read.avro("00_0.avro")
> df.show(false)






[jira] [Created] (SPARK-20879) Spark SQL will read Date type column from avro file as Int

2017-05-24 Thread bing huang (JIRA)
bing huang created SPARK-20879:
--

 Summary: Spark SQL will read Date type column from avro file as Int
 Key: SPARK-20879
 URL: https://issues.apache.org/jira/browse/SPARK-20879
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.1.1, 2.1.0, 2.0.2, 2.0.1, 2.0.0, 1.6.3, 1.6.2, 1.6.1, 
1.6.0
Reporter: bing huang


I was using the following code and the attached avro file.
The avro file has 5 columns named (date1, date2, date3, start_date, end_date)
with types (string, string, date, string, string) respectively.
You can verify the column types in Hive with either "select * from 00_0.avro"
or "describe 00_0.avro".

The Spark SQL code is below; as you can see, the third column (date3) comes
back as int instead of date.
// assumes the spark-avro package (com.databricks:spark-avro) is on the classpath
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import com.databricks.spark.avro._
val conf = new SparkConf().setAppName("test").setMaster("local[3]")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.avro("00_0.avro")
df.show(false)
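
A hedged workaround sketch for anyone hitting this: the Avro "date" logical type is physically an int holding days since 1970-01-01, so (assuming date3 in this file really is days since the epoch) the raw int can be converted back into a DateType column after reading:
{code}
import org.apache.spark.sql.functions.expr

// Convert the days-since-epoch int produced by the avro reader back into a date.
val fixed = df.withColumn("date3", expr("date_add(cast('1970-01-01' as date), date3)"))
fixed.printSchema()  // date3 should now be reported as: date
{code}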







[jira] [Commented] (SPARK-20846) Incorrect posgres sql array column schema inferred from table.

2017-05-24 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024252#comment-16024252
 ] 

Takeshi Yamamuro commented on SPARK-20846:
--

Basically, I think we do not allow nested arrays in the JDBC data source:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L473

Also, it seems difficult to handle this case because int[] and int[][] have
the same metadata in the ResultSetMetaData of the PostgreSQL JDBC driver.
Actually, in the PostgreSQL shell they show the same type (I'm not sure why,
though...):
{code}

postgres=# create table tx(a int[], b int[][]);
CREATE TABLE

postgres=# \d tx
Table "public.tx"
 Column |   Type| Modifiers
+---+---
 a  | integer[] |
 b  | integer[] |
{code}
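
To make the "same metadata" point concrete, here is a small java.sql sketch (connection settings are placeholders) that prints what the PostgreSQL driver reports for the two columns of the table above; both should come back with the same array type name and no dimensionality:
{code}
import java.sql.DriverManager

// Placeholder connection settings; adjust for a real database.
val conn = DriverManager.getConnection(
  "jdbc:postgresql://localhost:5432/postgres", "user", "password")
try {
  val md = conn.createStatement()
    .executeQuery("SELECT a, b FROM tx LIMIT 1")
    .getMetaData
  for (i <- 1 to md.getColumnCount) {
    // Expect the same type name (e.g. "_int4") for both a (int[]) and b (int[][]).
    println(s"${md.getColumnName(i)} -> ${md.getColumnTypeName(i)}" +
      s" (java.sql.Types ${md.getColumnType(i)})")
  }
} finally {
  conn.close()
}
{code}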

> Incorrect posgres sql array column schema inferred from table.
> --
>
> Key: SPARK-20846
> URL: https://issues.apache.org/jira/browse/SPARK-20846
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: Stuart Reynolds
>
> When reading a table containing int[][] columns from postgres, the column is 
> inferred as int[] (should be int[][]).
> {code:python}
> from pyspark.sql import SQLContext
> import pandas as pd
> from dataIngest.util.sqlUtil import asSQLAlchemyEngine
> user,password = ..., ...
> url = "postgresql://hostname:5432/dbname"
> url = 'jdbc:'+url
> properties = {'user': user, 'password': password}
> engine = ... sql alchemy engine ...
> # Create pandas df with int[] and int[][]
> df = pd.DataFrame({
> 'a1': [[1,2,None],[1,2,3], None],
> 'b2':  [[[1],[None],[3]], [[1],[2],[3]], None]
> })
> # Store df into postgres as table _dfjunk
> with engine.connect().execution_options(autocommit=True) as con:
> con.execute("""
> DROP TABLE IF EXISTS _dfjunk;
> 
> CREATE TABLE _dfjunk (
>   a1 int[] NULL,
>   b2 int[][] NULL
> );
> """)
> df.to_sql("_dfjunk", con, index=None, if_exists="append")
> # Let's access via spark
> sc = get_spark_context(master="local")
> sqlContext = SQLContext(sc)
> print "pandas DF as spark DF:"
> df = sqlContext.createDataFrame(df)
> df.printSchema()
> df.show()
> df.registerTempTable("df")
> print sqlContext.sql("select * from df").collect()
> ### Export _dfjunk as table df3
> df3 = sqlContext.read.format("jdbc"). \
> option("url", url). \
> option("driver", "org.postgresql.Driver"). \
> option("useUnicode", "true"). \
> option("continueBatchOnError","true"). \
> option("useSSL", "false"). \
> option("user", user). \
> option("password", password). \
> option("dbtable", "_dfjunk").\
> load()
> df3.registerTempTable("df3")
> print "DF inferred from postgres:"
> df3.printSchema()
> df3.show()
> print "DF queried from postgres:"
> df3 = sqlContext.sql("select * from df3")
> df3.printSchema()
> df3.show()
> print df3.collect()
> {code}
> Errors out with:
> pandas DF as spark DF:
> {noformat}
> root
>  |-- a1: array (nullable = true)
>  |    |-- element: long (containsNull = true)
>  |-- b2: array (nullable = true)
>  |    |-- element: array (containsNull = true)
>  |    |    |-- element: long (containsNull = true)   <<< ** THIS IS CORRECT
> +------------+--------------------+
> |          a1|                  b2|
> +------------+--------------------+
> |[1, 2, null]|[WrappedArray(1),...|
> |   [1, 2, 3]|[WrappedArray(1),...|
> |        null|                null|
> +------------+--------------------+
> [Row(a1=[1, 2, None], b2=[[1], [None], [3]]), Row(a1=[1, 2, 3], b2=[[1], [2], 
> [3]]), Row(a1=None, b2=None)]
> DF inferred from postgres:
> root
>  |-- a1: array (nullable = true)
>  |    |-- element: integer (containsNull = true)
>  |-- b2: array (nullable = true)
>  |    |-- element: integer (containsNull = true)   <<< ** THIS IS WRONG.
> The column is an array of arrays.
> 17/05/22 15:00:39 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
> java.lang.ClassCastException: [Ljava.lang.Integer; cannot be cast to 
> java.lang.Integer
>   at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
>   at 
> org.apache.spark.sql.catalyst.util.GenericArrayData.getInt(GenericArrayData.scala:62)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun

[jira] [Assigned] (SPARK-20640) Make rpc timeout and retry for shuffle registration configurable

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20640:


Assignee: Apache Spark

> Make rpc timeout and retry for shuffle registration configurable
> 
>
> Key: SPARK-20640
> URL: https://issues.apache.org/jira/browse/SPARK-20640
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle
>Affects Versions: 2.0.2
>Reporter: Sital Kedia
>Assignee: Apache Spark
>
> Currently the shuffle service registration timeout and retry count are 
> hardcoded (see 
> https://github.com/sitalkedia/spark/blob/master/network/shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleClient.java#L144
>  and 
> https://github.com/sitalkedia/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L197).
>  This works well for small workloads, but under heavy load, when the shuffle 
> service is busy transferring large amounts of data, we see significant delays 
> in responding to the registration request. As a result, executors often fail 
> to register with the shuffle service, eventually failing the job. We need to 
> make these two parameters configurable.
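
For reference, a rough sketch of what "configurable" could look like on the client side; the two config keys below (spark.shuffle.registration.timeout and spark.shuffle.registration.maxAttempts) are illustrative names, not necessarily what an eventual patch will use:
{code}
import org.apache.spark.SparkConf

// Sketch only: read the two values from SparkConf instead of hardcoding them,
// then pass them down to the shuffle registration call.
val conf = new SparkConf()
val registrationTimeoutMs =
  conf.getTimeAsMs("spark.shuffle.registration.timeout", "5000ms")   // illustrative key
val registrationMaxAttempts =
  conf.getInt("spark.shuffle.registration.maxAttempts", 3)           // illustrative key
{code}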






[jira] [Assigned] (SPARK-20640) Make rpc timeout and retry for shuffle registration configurable

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20640:


Assignee: (was: Apache Spark)

> Make rpc timeout and retry for shuffle registration configurable
> 
>
> Key: SPARK-20640
> URL: https://issues.apache.org/jira/browse/SPARK-20640
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle
>Affects Versions: 2.0.2
>Reporter: Sital Kedia
>
> Currently the shuffle service registration timeout and retry count are 
> hardcoded (see 
> https://github.com/sitalkedia/spark/blob/master/network/shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleClient.java#L144
>  and 
> https://github.com/sitalkedia/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L197).
>  This works well for small workloads, but under heavy load, when the shuffle 
> service is busy transferring large amounts of data, we see significant delays 
> in responding to the registration request. As a result, executors often fail 
> to register with the shuffle service, eventually failing the job. We need to 
> make these two parameters configurable.






[jira] [Closed] (SPARK-20565) Improve the error message for unsupported JDBC types

2017-05-24 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li closed SPARK-20565.
---
   Resolution: Fixed
Fix Version/s: 2.3.0

> Improve the error message for unsupported JDBC types
> 
>
> Key: SPARK-20565
> URL: https://issues.apache.org/jira/browse/SPARK-20565
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Xiao Li
>Assignee: Xiao Li
> Fix For: 2.3.0
>
>
> For unsupported data types, we simply output the type number instead of the 
> type name. 
> {noformat}
> java.sql.SQLException: Unsupported type 2014
> {noformat}
> We should improve it by outputting its name.
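
One way to do that, sketched below, is java.sql.JDBCType, which maps the numeric java.sql.Types constant to a readable name (a sketch of one possible approach, not necessarily the actual fix):
{code}
import java.sql.{JDBCType, SQLException}

// Turn the raw java.sql.Types constant into a readable name for the error message.
def unsupportedJdbcTypeException(sqlType: Int): SQLException = {
  val typeName =
    try JDBCType.valueOf(sqlType).getName
    catch { case _: IllegalArgumentException => sqlType.toString } // unknown vendor code
  new SQLException(s"Unsupported type $typeName")
}

// e.g. unsupportedJdbcTypeException(2014).getMessage
//      -> "Unsupported type TIMESTAMP_WITH_TIMEZONE"
{code}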






[jira] [Commented] (SPARK-20640) Make rpc timeout and retry for shuffle registration configurable

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024217#comment-16024217
 ] 

Apache Spark commented on SPARK-20640:
--

User 'liyichao' has created a pull request for this issue:
https://github.com/apache/spark/pull/18092

> Make rpc timeout and retry for shuffle registration configurable
> 
>
> Key: SPARK-20640
> URL: https://issues.apache.org/jira/browse/SPARK-20640
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle
>Affects Versions: 2.0.2
>Reporter: Sital Kedia
>
> Currently the shuffle service registration timeout and retry count are 
> hardcoded (see 
> https://github.com/sitalkedia/spark/blob/master/network/shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleClient.java#L144
>  and 
> https://github.com/sitalkedia/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L197).
>  This works well for small workloads, but under heavy load, when the shuffle 
> service is busy transferring large amounts of data, we see significant delays 
> in responding to the registration request. As a result, executors often fail 
> to register with the shuffle service, eventually failing the job. We need to 
> make these two parameters configurable.






[jira] [Commented] (SPARK-20565) Improve the error message for unsupported JDBC types

2017-05-24 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024215#comment-16024215
 ] 

Xiao Li commented on SPARK-20565:
-

Fixed in https://github.com/apache/spark/pull/17835

> Improve the error message for unsupported JDBC types
> 
>
> Key: SPARK-20565
> URL: https://issues.apache.org/jira/browse/SPARK-20565
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Xiao Li
>Assignee: Xiao Li
> Fix For: 2.3.0
>
>
> For unsupported data types, we simply output the type number instead of the 
> type name. 
> {noformat}
> java.sql.SQLException: Unsupported type 2014
> {noformat}
> We should improve it by outputting its name.






[jira] [Created] (SPARK-20878) Pyspark date string parsing erroneously treats 1 as 10

2017-05-24 Thread Nick Lothian (JIRA)
Nick Lothian created SPARK-20878:


 Summary: Pyspark date string parsing erroneously treats 1 as 10 
 Key: SPARK-20878
 URL: https://issues.apache.org/jira/browse/SPARK-20878
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 2.0.2
Reporter: Nick Lothian


PySpark date filter columns can take a String in the format yyyy-mm-dd and 
handle it correctly. This doesn't appear to be documented anywhere (?) but is 
extremely useful.

However, it silently converts the format yyyy-mm-d to yyyy-mm-d0 and yyyy-m-dd 
to yyyy-m0-dd.

For example, 2017-02-1 will be treated as 2017-02-10, and 2017-2-01 as 
2017-20-01 (which is invalid, but does not throw an error).

This causes bugs that are very hard to discover.

Test code:

{code}
from pyspark.sql.types import *
from datetime import datetime

schema = StructType([StructField("label", StringType(), True),\
StructField("date", DateType(), True)]\
   )


data = [('One', datetime.strptime("2017/02/01", '%Y/%m/%d')), 
('Two', datetime.strptime("2017/02/02", '%Y/%m/%d')), 
('Ten', datetime.strptime("2017/02/10", '%Y/%m/%d')),
('Eleven', datetime.strptime("2017/02/11", '%Y/%m/%d'))]

df = sqlContext.createDataFrame(data, schema)
df.printSchema()

print("All Data")
df.show()

print("Filter greater than 1 Jan (using 2017-02-1)")
df.filter(df.date > '2017-02-1').show()


print("Filter greater than 1 Jan (using 2017-02-01)")
df.filter(df.date > '2017-02-01').show()


print("Filter greater than 1 Jan (using 2017-2-01)")
df.filter(df.date > '2017-2-01').show()

{code}

Output:

{code}
root
 |-- label: string (nullable = true)
 |-- date: date (nullable = true)

All Data
+------+----------+
| label|      date|
+------+----------+
|   One|2017-02-01|
|   Two|2017-02-02|
|   Ten|2017-02-10|
|Eleven|2017-02-11|
+------+----------+

Filter greater than 1 Feb (using 2017-02-1)
+------+----------+
| label|      date|
+------+----------+
|   Ten|2017-02-10|
|Eleven|2017-02-11|
+------+----------+

Filter greater than 1 Feb (using 2017-02-01)
+------+----------+
| label|      date|
+------+----------+
|   Two|2017-02-02|
|   Ten|2017-02-10|
|Eleven|2017-02-11|
+------+----------+

Filter greater than 1 Feb (using 2017-2-01)
+-----+----+
|label|date|
+-----+----+
+-----+----+
{code}
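
A hedged workaround sketch (written with the Scala API, but the same idea applies from PySpark): compare the column against a typed date value instead of a partially formatted string, so no implicit string padding is involved. Column names follow the example above:
{code}
import java.sql.Date
import org.apache.spark.sql.functions.{col, lit}

// Comparing against a real date literal sidesteps the string-to-date padding entirely.
df.filter(col("date") > lit(Date.valueOf("2017-02-01"))).show()
{code}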







[jira] [Updated] (SPARK-20876) if the input parameter is float type for ceil or floor ,the result is not we expected

2017-05-24 Thread liuxian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxian updated SPARK-20876:

Description: 
spark-sql>SELECT ceil(cast(12345.1233 as float));
spark-sql>12345
For this case, the result we expected is 12346

spark-sql>SELECT floor(cast(-12345.1233 as float));
spark-sql>-12345
For this case, the result we expected is  -12346

  was:
spark-sql>SELECT ceil(cast(12345.1233 as float));
spark-sql>12345
For this case, we expected is 12346

spark-sql>SELECT floor(cast(-12345.1233 as float));
spark-sql>-12345
For this case, the result we expected is  -12346


> if the input parameter is float type for  ceil or floor ,the result is not we 
> expected
> --
>
> Key: SPARK-20876
> URL: https://issues.apache.org/jira/browse/SPARK-20876
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: liuxian
>
> spark-sql>SELECT ceil(cast(12345.1233 as float));
> spark-sql>12345
> For this case, the result we expected is 12346
> spark-sql>SELECT floor(cast(-12345.1233 as float));
> spark-sql>-12345
> For this case, the result we expected is  -12346
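
A hedged workaround sketch while the float behaviour is sorted out: casting to double (or decimal) before calling ceil/floor appears to give the expected results (spark here is the usual SparkSession):
{code}
// Casting to DOUBLE instead of FLOAT avoids the truncation described above.
spark.sql(
  "SELECT ceil(CAST(12345.1233 AS DOUBLE)), floor(CAST(-12345.1233 AS DOUBLE))"
).show()
// expected: 12346 and -12346
{code}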






[jira] [Commented] (SPARK-4131) Support "Writing data into the filesystem from queries"

2017-05-24 Thread Santhavathi S (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024212#comment-16024212
 ] 

Santhavathi S commented on SPARK-4131:
--

Is this feature available yet?

> Support "Writing data into the filesystem from queries"
> ---
>
> Key: SPARK-4131
> URL: https://issues.apache.org/jira/browse/SPARK-4131
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 1.1.0
>Reporter: XiaoJing wang
>Assignee: Fei Wang
>Priority: Critical
>   Original Estimate: 0.05h
>  Remaining Estimate: 0.05h
>
> Spark SQL does not support writing data into the filesystem from queries, e.g.:
> {code}
> insert overwrite LOCAL DIRECTORY '/data1/wangxj/sql_spark' select * 
> from page_views;
> {code}
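
Not an answer to whether the HiveQL syntax itself is supported, but a commonly used alternative on Spark 2.x, sketched with the DataFrame API (sqlContext is the usual SQLContext; the path and output format are illustrative):
{code}
// Run the query, then write the result to a local directory explicitly.
val result = sqlContext.sql("SELECT * FROM page_views")
result.write
  .mode("overwrite")
  .format("csv")                              // illustrative output format
  .save("file:///data1/wangxj/sql_spark")     // illustrative local path
{code}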






[jira] [Updated] (SPARK-20876) if the input parameter is float type for ceil ,the result is not we expected

2017-05-24 Thread liuxian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxian updated SPARK-20876:

Description: 
spark-sql>SELECT ceil(cast(12345.1233 as float));
spark-sql>12345
For this case, we expected is 12346

spark-sql>SELECT floor(cast(-12345.1233 as float));
spark-sql>-12345
For this case, the result we expected is  -12346

  was:
spark-sql>SELECT ceil(cast(12345.1233 as float));
spark-sql>12345
For this case, we expected is 12346



> if the input parameter is float type for  ceil ,the result is not we expected
> -
>
> Key: SPARK-20876
> URL: https://issues.apache.org/jira/browse/SPARK-20876
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: liuxian
>
> spark-sql>SELECT ceil(cast(12345.1233 as float));
> spark-sql>12345
> For this case, the result we expected is 12346
> spark-sql>SELECT floor(cast(-12345.1233 as float));
> spark-sql>-12345
> For this case, the result we expected is  -12346






[jira] [Updated] (SPARK-20876) if the input parameter is float type for ceil or floor ,the result is not we expected

2017-05-24 Thread liuxian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuxian updated SPARK-20876:

Summary: if the input parameter is float type for  ceil or floor ,the 
result is not we expected  (was: if the input parameter is float type for  ceil 
,the result is not we expected)

> if the input parameter is float type for  ceil or floor ,the result is not we 
> expected
> --
>
> Key: SPARK-20876
> URL: https://issues.apache.org/jira/browse/SPARK-20876
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: liuxian
>
> spark-sql>SELECT ceil(cast(12345.1233 as float));
> spark-sql>12345
> For this case, the result we expected is 12346
> spark-sql>SELECT floor(cast(-12345.1233 as float));
> spark-sql>-12345
> For this case, the result we expected is  -12346






[jira] [Assigned] (SPARK-20873) Improve the error message for unsupported Column Type

2017-05-24 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-20873:
---

Assignee: (was: Xiao Li)

> Improve the error message for unsupported Column Type
> -
>
> Key: SPARK-20873
> URL: https://issues.apache.org/jira/browse/SPARK-20873
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Ruben Janssen
>
> For an unsupported column type, we simply output the column type object instead 
> of a readable type name. 
> {noformat}
> java.lang.Exception: Unsupported type: 
> org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
> {noformat}
> We should improve it by outputting its name.






[jira] [Assigned] (SPARK-20877) Investigate if tests will time out on CRAN

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20877:


Assignee: Apache Spark

> Investigate if tests will time out on CRAN
> --
>
> Key: SPARK-20877
> URL: https://issues.apache.org/jira/browse/SPARK-20877
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Affects Versions: 2.2.0
>Reporter: Felix Cheung
>Assignee: Apache Spark
>







[jira] [Assigned] (SPARK-20877) Investigate if tests will time out on CRAN

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20877:


Assignee: (was: Apache Spark)

> Investigate if tests will time out on CRAN
> --
>
> Key: SPARK-20877
> URL: https://issues.apache.org/jira/browse/SPARK-20877
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Affects Versions: 2.2.0
>Reporter: Felix Cheung
>







[jira] [Commented] (SPARK-20877) Investigate if tests will time out on CRAN

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024159#comment-16024159
 ] 

Apache Spark commented on SPARK-20877:
--

User 'felixcheung' has created a pull request for this issue:
https://github.com/apache/spark/pull/18104

> Investigate if tests will time out on CRAN
> --
>
> Key: SPARK-20877
> URL: https://issues.apache.org/jira/browse/SPARK-20877
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Affects Versions: 2.2.0
>Reporter: Felix Cheung
>







[jira] [Created] (SPARK-20877) Investigate if tests will time out on CRAN

2017-05-24 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-20877:


 Summary: Investigate if tests will time out on CRAN
 Key: SPARK-20877
 URL: https://issues.apache.org/jira/browse/SPARK-20877
 Project: Spark
  Issue Type: Sub-task
  Components: SparkR
Affects Versions: 2.2.0
Reporter: Felix Cheung









[jira] [Assigned] (SPARK-20876) if the input parameter is float type for ceil ,the result is not we expected

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20876:


Assignee: (was: Apache Spark)

> if the input parameter is float type for  ceil ,the result is not we expected
> -
>
> Key: SPARK-20876
> URL: https://issues.apache.org/jira/browse/SPARK-20876
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: liuxian
>
> spark-sql>SELECT ceil(cast(12345.1233 as float));
> spark-sql>12345
> For this case, the result we expected is 12346






[jira] [Assigned] (SPARK-20876) if the input parameter is float type for ceil ,the result is not we expected

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20876:


Assignee: Apache Spark

> if the input parameter is float type for  ceil ,the result is not we expected
> -
>
> Key: SPARK-20876
> URL: https://issues.apache.org/jira/browse/SPARK-20876
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: liuxian
>Assignee: Apache Spark
>
> spark-sql>SELECT ceil(cast(12345.1233 as float));
> spark-sql>12345
> For this case, the result we expected is 12346






[jira] [Commented] (SPARK-20876) if the input parameter is float type for ceil ,the result is not we expected

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024077#comment-16024077
 ] 

Apache Spark commented on SPARK-20876:
--

User '10110346' has created a pull request for this issue:
https://github.com/apache/spark/pull/18103

> if the input parameter is float type for  ceil ,the result is not we expected
> -
>
> Key: SPARK-20876
> URL: https://issues.apache.org/jira/browse/SPARK-20876
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: liuxian
>
> spark-sql>SELECT ceil(cast(12345.1233 as float));
> spark-sql>12345
> For this case, the result we expected is 12346






[jira] [Created] (SPARK-20876) if the input parameter is float type for ceil ,the result is not we expected

2017-05-24 Thread liuxian (JIRA)
liuxian created SPARK-20876:
---

 Summary: if the input parameter is float type for  ceil ,the 
result is not we expected
 Key: SPARK-20876
 URL: https://issues.apache.org/jira/browse/SPARK-20876
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.1.0
Reporter: liuxian


spark-sql>SELECT ceil(cast(12345.1233 as float));
spark-sql>12345
For this case, the result we expected is 12346







[jira] [Assigned] (SPARK-20875) Spark should print the log when the directory has been deleted

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20875:


Assignee: Apache Spark

> Spark should print the log when the directory has been deleted
> --
>
> Key: SPARK-20875
> URL: https://issues.apache.org/jira/browse/SPARK-20875
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: liuzhaokun
>Assignee: Apache Spark
>
> When the "deleteRecursively" method is invoked,spark doesn't print any log if 
> the path was deleted.For example,spark only print "Removing directory" when 
> the worker began cleaning spark.work.dir,but didn't print any log about "the 
> path has been delete".So, I can't judge whether  the path was deleted form 
> the worker's logfile,If there is any accidents about Linux.






[jira] [Assigned] (SPARK-20875) Spark should print the log when the directory has been deleted

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20875:


Assignee: (was: Apache Spark)

> Spark should print the log when the directory has been deleted
> --
>
> Key: SPARK-20875
> URL: https://issues.apache.org/jira/browse/SPARK-20875
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: liuzhaokun
>
> When the "deleteRecursively" method is invoked,spark doesn't print any log if 
> the path was deleted.For example,spark only print "Removing directory" when 
> the worker began cleaning spark.work.dir,but didn't print any log about "the 
> path has been delete".So, I can't judge whether  the path was deleted form 
> the worker's logfile,If there is any accidents about Linux.






[jira] [Commented] (SPARK-20875) Spark should print the log when the directory has been deleted

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024020#comment-16024020
 ] 

Apache Spark commented on SPARK-20875:
--

User 'liu-zhaokun' has created a pull request for this issue:
https://github.com/apache/spark/pull/18102

> Spark should print the log when the directory has been deleted
> --
>
> Key: SPARK-20875
> URL: https://issues.apache.org/jira/browse/SPARK-20875
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: liuzhaokun
>
> When the "deleteRecursively" method is invoked,spark doesn't print any log if 
> the path was deleted.For example,spark only print "Removing directory" when 
> the worker began cleaning spark.work.dir,but didn't print any log about "the 
> path has been delete".So, I can't judge whether  the path was deleted form 
> the worker's logfile,If there is any accidents about Linux.






[jira] [Created] (SPARK-20875) Spark should print the log when the directory has been deleted

2017-05-24 Thread liuzhaokun (JIRA)
liuzhaokun created SPARK-20875:
--

 Summary: Spark should print the log when the directory has been 
deleted
 Key: SPARK-20875
 URL: https://issues.apache.org/jira/browse/SPARK-20875
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.1.1
Reporter: liuzhaokun


When the "deleteRecursively" method is invoked,spark doesn't print any log if 
the path was deleted.For example,spark only print "Removing directory" when the 
worker began cleaning spark.work.dir,but didn't print any log about "the path 
has been delete".So, I can't judge whether  the path was deleted form the 
worker's logfile,If there is any accidents about Linux.
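
A minimal sketch of the kind of logging being asked for, assuming Spark's own Utils.deleteRecursively and the Logging trait (logInfo/logWarning) are in scope as they are in the Worker code; the exact wording and placement are illustrative:
{code}
import java.io.File
import org.apache.spark.util.Utils

// Log the outcome of the cleanup, not just the fact that it started.
def removeWorkDir(dir: File): Unit = {
  logInfo(s"Removing directory: ${dir.getAbsolutePath}")
  try {
    Utils.deleteRecursively(dir)
    logInfo(s"Successfully removed directory: ${dir.getAbsolutePath}")
  } catch {
    case e: Exception =>
      logWarning(s"Failed to remove directory: ${dir.getAbsolutePath}", e)
  }
}
{code}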






[jira] [Commented] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2017-05-24 Thread Rupesh Mane (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023980#comment-16023980
 ] 

Rupesh Mane commented on SPARK-18105:
-

For the stack trace provided earlier, I found the root cause: the issue is in the 
executor while reading the compressed broadcast variable. I'm specifying 
*spark.io.compression.codec* as *snappy*, so both the driver and the executors 
should be using this codec to compress and decompress the broadcast variable. But 
it seems the executor is defaulting to LZ4 instead of Snappy.

I'm not seeing this in my dev environment, which is on Spark 2.1.1, but I am 
seeing it on AWS EMR 5.5.0, which has Spark 2.1.0. Not sure whether this is 
related to AWS or to Spark.

Thanks - Rupesh
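
For what it's worth, a minimal sketch of pinning the codec on the conf used to launch the application (or equivalently via --conf / spark-defaults.conf), so the driver and executors cannot end up with different codecs; the app name is a placeholder:
{code}
import org.apache.spark.{SparkConf, SparkContext}

// spark.io.compression.codec should be set before the context starts so that it
// is propagated to the executors along with the rest of the application conf.
val conf = new SparkConf()
  .setAppName("codec-check")                        // placeholder
  .set("spark.io.compression.codec", "snappy")
val sc = new SparkContext(conf)
{code}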



> LZ4 failed to decompress a stream of shuffled data
> --
>
> Key: SPARK-18105
> URL: https://issues.apache.org/jira/browse/SPARK-18105
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Reporter: Davies Liu
>Assignee: Davies Liu
>
> When lz4 is used to compress the shuffle files, decompression may fail with 
> "Stream is corrupted".
> {code}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 92 in stage 5.0 failed 4 times, most recent failure: Lost task 92.3 in 
> stage 5.0 (TID 16616, 10.0.27.18): java.io.IOException: Stream is corrupted
>   at 
> org.apache.spark.io.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:220)
>   at 
> org.apache.spark.io.LZ4BlockInputStream.available(LZ4BlockInputStream.java:109)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:353)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
>   at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:127)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:110)
>   at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
>   at 
> org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>   at 
> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:397)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> https://github.com/jpountz/lz4-java/issues/89






[jira] [Assigned] (SPARK-20874) The "examples" project doesn't depend on Structured Streaming Kafka source

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20874:


Assignee: Shixiong Zhu  (was: Apache Spark)

> The "examples" project doesn't depend on Structured Streaming Kafka source
> --
>
> Key: SPARK-20874
> URL: https://issues.apache.org/jira/browse/SPARK-20874
> Project: Spark
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 2.1.0, 2.1.1
>Reporter: Shixiong Zhu
>Assignee: Shixiong Zhu
>Priority: Minor
>
> Right now running `bin/run-example StructuredKafkaWordCount ...` will throw 
> an error saying "kafka" source not found.






[jira] [Assigned] (SPARK-20874) The "examples" project doesn't depend on Structured Streaming Kafka source

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20874:


Assignee: Apache Spark  (was: Shixiong Zhu)

> The "examples" project doesn't depend on Structured Streaming Kafka source
> --
>
> Key: SPARK-20874
> URL: https://issues.apache.org/jira/browse/SPARK-20874
> Project: Spark
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 2.1.0, 2.1.1
>Reporter: Shixiong Zhu
>Assignee: Apache Spark
>Priority: Minor
>
> Right now running `bin/run-example StructuredKafkaWordCount ...` will throw 
> an error saying "kafka" source not found.






[jira] [Commented] (SPARK-20874) The "examples" project doesn't depend on Structured Streaming Kafka source

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023957#comment-16023957
 ] 

Apache Spark commented on SPARK-20874:
--

User 'zsxwing' has created a pull request for this issue:
https://github.com/apache/spark/pull/18101

> The "examples" project doesn't depend on Structured Streaming Kafka source
> --
>
> Key: SPARK-20874
> URL: https://issues.apache.org/jira/browse/SPARK-20874
> Project: Spark
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 2.1.0, 2.1.1
>Reporter: Shixiong Zhu
>Assignee: Shixiong Zhu
>Priority: Minor
>
> Right now running `bin/run-example StructuredKafkaWordCount ...` will throw 
> an error saying "kafka" source not found.






[jira] [Created] (SPARK-20874) The "examples" project doesn't depend on Structured Streaming Kafka source

2017-05-24 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20874:


 Summary: The "examples" project doesn't depend on Structured 
Streaming Kafka source
 Key: SPARK-20874
 URL: https://issues.apache.org/jira/browse/SPARK-20874
 Project: Spark
  Issue Type: Bug
  Components: Examples
Affects Versions: 2.1.1, 2.1.0
Reporter: Shixiong Zhu
Assignee: Shixiong Zhu
Priority: Minor


Right now running `bin/run-example StructuredKafkaWordCount ...` will throw an 
error saying "kafka" source not found.






[jira] [Resolved] (SPARK-20403) It is wrong to the instructions of some functions,such as boolean,tinyint,smallint,int,bigint,float,double,decimal,date,timestamp,binary,string

2017-05-24 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-20403.
-
   Resolution: Fixed
 Assignee: liuxian
Fix Version/s: 2.2.0

> It is wrong to the instructions of some functions,such as  
> boolean,tinyint,smallint,int,bigint,float,double,decimal,date,timestamp,binary,string
> 
>
> Key: SPARK-20403
> URL: https://issues.apache.org/jira/browse/SPARK-20403
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 2.1.0
>Reporter: liuxian
>Assignee: liuxian
>Priority: Minor
> Fix For: 2.2.0
>
>
> spark-sql>desc function boolean;
> Function: boolean
> Class: org.apache.spark.sql.catalyst.expressions.Cast
> Usage: boolean(expr AS type) - Casts the value `expr` to the target data type 
> `type`.
> spark-sql>desc function int;
> Function: int
> Class: org.apache.spark.sql.catalyst.expressions.Cast
> Usage: int(expr AS type) - Casts the value `expr` to the target data type 
> `type`.






[jira] [Updated] (SPARK-18406) Race between end-of-task and completion iterator read lock release

2017-05-24 Thread Wenchen Fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-18406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan updated SPARK-18406:

Fix Version/s: 2.1.2

> Race between end-of-task and completion iterator read lock release
> --
>
> Key: SPARK-18406
> URL: https://issues.apache.org/jira/browse/SPARK-18406
> Project: Spark
>  Issue Type: Bug
>  Components: Block Manager, Spark Core
>Affects Versions: 2.0.0, 2.0.1
>Reporter: Josh Rosen
>Assignee: Jiang Xingbo
> Fix For: 2.0.3, 2.1.2, 2.2.0
>
>
> The following log comes from a production streaming job where executors 
> periodically die due to uncaught exceptions during block release:
> {code}
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7921
> 16/11/07 17:11:06 INFO Executor: Running task 0.0 in stage 2390.0 (TID 7921)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7922
> 16/11/07 17:11:06 INFO Executor: Running task 1.0 in stage 2390.0 (TID 7922)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7923
> 16/11/07 17:11:06 INFO Executor: Running task 2.0 in stage 2390.0 (TID 7923)
> 16/11/07 17:11:06 INFO TorrentBroadcast: Started reading broadcast variable 
> 2721
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7924
> 16/11/07 17:11:06 INFO Executor: Running task 3.0 in stage 2390.0 (TID 7924)
> 16/11/07 17:11:06 INFO MemoryStore: Block broadcast_2721_piece0 stored as 
> bytes in memory (estimated size 5.0 KB, free 4.9 GB)
> 16/11/07 17:11:06 INFO TorrentBroadcast: Reading broadcast variable 2721 took 
> 3 ms
> 16/11/07 17:11:06 INFO MemoryStore: Block broadcast_2721 stored as values in 
> memory (estimated size 9.4 KB, free 4.9 GB)
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_1 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_3 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_2 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_4 locally
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 2, boot = -566, init = 
> 567, finish = 1
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 7, boot = -540, init = 
> 541, finish = 6
> 16/11/07 17:11:06 INFO Executor: Finished task 2.0 in stage 2390.0 (TID 
> 7923). 1429 bytes result sent to driver
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 8, boot = -532, init = 
> 533, finish = 7
> 16/11/07 17:11:06 INFO Executor: Finished task 3.0 in stage 2390.0 (TID 
> 7924). 1429 bytes result sent to driver
> 16/11/07 17:11:06 ERROR Executor: Exception in task 0.0 in stage 2390.0 (TID 
> 7921)
> java.lang.AssertionError: assertion failed
>   at scala.Predef$.assert(Predef.scala:165)
>   at 
> org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:84)
>   at 
> org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:66)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:362)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:361)
>   at scala.Option.foreach(Option.scala:236)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:361)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:356)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:356)
>   at 
> org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:646)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7925
> 16/11/07 17:11:06 INFO Executor: Running task 0.1 in stage 2390.0 (TID 7925)
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_1 locally
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 41, boot = -536, init = 
> 576, finish = 1
> 16/11/07 17:11:06 INFO Executor: Finished task 1.0 in stage 2390.0 (TID 
> 7922). 1429 bytes result sent to driver
> 16/11/07 17:11:06 ERROR Utils: Uncaught exception in thread stdout writer for 
> /databricks/python/bin/python
> java.lang.AssertionError: assertion failed: Block rdd_2741_1 is not locked 
> for read

[jira] [Resolved] (SPARK-20872) ShuffleExchange.nodeName should handle null coordinator

2017-05-24 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-20872.
-
   Resolution: Fixed
Fix Version/s: 2.2.0

> ShuffleExchange.nodeName should handle null coordinator
> ---
>
> Key: SPARK-20872
> URL: https://issues.apache.org/jira/browse/SPARK-20872
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kris Mok
>Assignee: Kris Mok
>Priority: Trivial
>  Labels: easyfix
> Fix For: 2.2.0
>
>
> A ShuffleExchange's coordinator can be null sometimes, and when we need to do 
> a toString() on it, it'll go to ShuffleExchange.nodeName() and throw a 
> MatchError there because of inexhaustive match -- the match only handles Some 
> and None, but doesn't handle null.
> An example of this issue is when trying to inspect a Catalyst physical 
> operator on the Executor side in an IDE:
> {code:none}
> child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
> exception. Cannot evaluate 
> org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
> {code}
> where this WholeStageCodegenExec transitively references a ShuffleExchange.
> On the Executor side, this ShuffleExchange instance is deserialized from the 
> data sent over from the Driver, and because the coordinator field is marked 
> transient, it's not carried over to the Executor, that's why it can be null 
> upon inspection.
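
A minimal sketch of the null-tolerant pattern the description implies (the names are illustrative, not the actual ShuffleExchange code):
{code}
// Treat a null coordinator (e.g. a deserialized transient field on the executor)
// like None instead of letting the match fall through with a MatchError.
def nodeName(coordinator: Option[AnyRef]): String = {
  val extraInfo = coordinator match {
    case Some(c) => s"(coordinator id: ${c.hashCode()})"
    case None    => ""
    case null    => ""
  }
  s"Exchange $extraInfo".trim
}
{code}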






[jira] [Assigned] (SPARK-20872) ShuffleExchange.nodeName should handle null coordinator

2017-05-24 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reassigned SPARK-20872:
---

Assignee: Kris Mok

> ShuffleExchange.nodeName should handle null coordinator
> ---
>
> Key: SPARK-20872
> URL: https://issues.apache.org/jira/browse/SPARK-20872
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kris Mok
>Assignee: Kris Mok
>Priority: Trivial
>  Labels: easyfix
> Fix For: 2.2.0
>
>
> A ShuffleExchange's coordinator can be null sometimes, and when we need to do 
> a toString() on it, it'll go to ShuffleExchange.nodeName() and throw a 
> MatchError there because of inexhaustive match -- the match only handles Some 
> and None, but doesn't handle null.
> An example of this issue is when trying to inspect a Catalyst physical 
> operator on the Executor side in an IDE:
> {code:none}
> child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
> exception. Cannot evaluate 
> org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
> {code}
> where this WholeStageCodegenExec transitively references a ShuffleExchange.
> On the Executor side, this ShuffleExchange instance is deserialized from the 
> data sent over from the Driver, and because the coordinator field is marked 
> transient, it's not carried over to the Executor, that's why it can be null 
> upon inspection.






[jira] [Resolved] (SPARK-20205) DAGScheduler posts SparkListenerStageSubmitted before updating stage

2017-05-24 Thread Marcelo Vanzin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin resolved SPARK-20205.

   Resolution: Fixed
 Assignee: Marcelo Vanzin
Fix Version/s: 2.3.0

> DAGScheduler posts SparkListenerStageSubmitted before updating stage
> 
>
> Key: SPARK-20205
> URL: https://issues.apache.org/jira/browse/SPARK-20205
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Marcelo Vanzin
>Assignee: Marcelo Vanzin
> Fix For: 2.3.0
>
>
> Probably affects other versions, haven't checked.
> The code that submits the event to the bus is around line 991:
> {code}
> stage.makeNewStageAttempt(partitionsToCompute.size, 
> taskIdToLocations.values.toSeq)
> listenerBus.post(SparkListenerStageSubmitted(stage.latestInfo, 
> properties))
> {code}
> Later in the same method, the stage information is updated (around line 1057):
> {code}
> if (tasks.size > 0) {
>   logInfo(s"Submitting ${tasks.size} missing tasks from $stage 
> (${stage.rdd}) (first 15 " +
> s"tasks are for partitions ${tasks.take(15).map(_.partitionId)})")
>   taskScheduler.submitTasks(new TaskSet(
> tasks.toArray, stage.id, stage.latestInfo.attemptId, jobId, 
> properties))
>   stage.latestInfo.submissionTime = Some(clock.getTimeMillis())
> {code}
> That means an event handler might get a stage submitted event with an unset 
> submission time.






[jira] [Commented] (SPARK-20848) Dangling threads when reading parquet files in local mode

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023863#comment-16023863
 ] 

Apache Spark commented on SPARK-20848:
--

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/18100

> Dangling threads when reading parquet files in local mode
> -
>
> Key: SPARK-20848
> URL: https://issues.apache.org/jira/browse/SPARK-20848
> Project: Spark
>  Issue Type: Bug
>  Components: Input/Output, SQL
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Nick Pritchard
>Assignee: Liang-Chi Hsieh
> Fix For: 2.1.2, 2.2.0
>
> Attachments: Screen Shot 2017-05-22 at 4.13.52 PM.png
>
>
> On each call to {{spark.read.parquet}}, a new ForkJoinPool is created. One of 
> the threads in the pool is kept in the {{WAITING}} state, and never stopped, 
> which leads to unbounded growth in number of threads.
> This behavior is a regression from v2.1.0.
> Reproducible example:
> {code}
> val spark = SparkSession
>   .builder()
>   .appName("test")
>   .master("local")
>   .getOrCreate()
> while(true) {
>   spark.read.parquet("/path/to/file")
>   Thread.sleep(5000)
> }
> {code}






[jira] [Commented] (SPARK-20589) Allow limiting task concurrency per stage

2017-05-24 Thread Mark Nelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023769#comment-16023769
 ] 

Mark Nelson commented on SPARK-20589:
-

I would find this very useful.  We're currently using coalesce to limit the 
simultaneous tasks in a stage that is querying Cassandra. But this gives us 
huge partitions.  If a query timeout causes a task to fail we can lose a lot of 
work.  And the chances of an individual task failing 4 times is high, killing 
the entire job.  Ideally I would like the stage to have a large number of 
partitions, but limit the number of simultaneous tasks for this one stage.

> Allow limiting task concurrency per stage
> -
>
> Key: SPARK-20589
> URL: https://issues.apache.org/jira/browse/SPARK-20589
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.1.0
>Reporter: Thomas Graves
>
> It would be nice to have the ability to limit the number of concurrent tasks 
> per stage.  This is useful when your spark job might be accessing another 
> service and you don't want to DOS that service.  For instance Spark writing 
> to hbase or Spark doing http puts on a service.  Many times you want to do 
> this without limiting the number of partitions. 






[jira] [Commented] (SPARK-18406) Race between end-of-task and completion iterator read lock release

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023759#comment-16023759
 ] 

Apache Spark commented on SPARK-18406:
--

User 'jiangxb1987' has created a pull request for this issue:
https://github.com/apache/spark/pull/18099

> Race between end-of-task and completion iterator read lock release
> --
>
> Key: SPARK-18406
> URL: https://issues.apache.org/jira/browse/SPARK-18406
> Project: Spark
>  Issue Type: Bug
>  Components: Block Manager, Spark Core
>Affects Versions: 2.0.0, 2.0.1
>Reporter: Josh Rosen
>Assignee: Jiang Xingbo
> Fix For: 2.0.3, 2.2.0
>
>
> The following log comes from a production streaming job where executors 
> periodically die due to uncaught exceptions during block release:
> {code}
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7921
> 16/11/07 17:11:06 INFO Executor: Running task 0.0 in stage 2390.0 (TID 7921)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7922
> 16/11/07 17:11:06 INFO Executor: Running task 1.0 in stage 2390.0 (TID 7922)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7923
> 16/11/07 17:11:06 INFO Executor: Running task 2.0 in stage 2390.0 (TID 7923)
> 16/11/07 17:11:06 INFO TorrentBroadcast: Started reading broadcast variable 
> 2721
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7924
> 16/11/07 17:11:06 INFO Executor: Running task 3.0 in stage 2390.0 (TID 7924)
> 16/11/07 17:11:06 INFO MemoryStore: Block broadcast_2721_piece0 stored as 
> bytes in memory (estimated size 5.0 KB, free 4.9 GB)
> 16/11/07 17:11:06 INFO TorrentBroadcast: Reading broadcast variable 2721 took 
> 3 ms
> 16/11/07 17:11:06 INFO MemoryStore: Block broadcast_2721 stored as values in 
> memory (estimated size 9.4 KB, free 4.9 GB)
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_1 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_3 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_2 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_4 locally
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 2, boot = -566, init = 
> 567, finish = 1
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 7, boot = -540, init = 
> 541, finish = 6
> 16/11/07 17:11:06 INFO Executor: Finished task 2.0 in stage 2390.0 (TID 
> 7923). 1429 bytes result sent to driver
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 8, boot = -532, init = 
> 533, finish = 7
> 16/11/07 17:11:06 INFO Executor: Finished task 3.0 in stage 2390.0 (TID 
> 7924). 1429 bytes result sent to driver
> 16/11/07 17:11:06 ERROR Executor: Exception in task 0.0 in stage 2390.0 (TID 
> 7921)
> java.lang.AssertionError: assertion failed
>   at scala.Predef$.assert(Predef.scala:165)
>   at 
> org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:84)
>   at 
> org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:66)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:362)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:361)
>   at scala.Option.foreach(Option.scala:236)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:361)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:356)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:356)
>   at 
> org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:646)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7925
> 16/11/07 17:11:06 INFO Executor: Running task 0.1 in stage 2390.0 (TID 7925)
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_1 locally
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 41, boot = -536, init = 
> 576, finish = 1
> 16/11/07 17:11:06 INFO Executor: Finished task 1.0 in stage 2390.0 (TID 
> 7922). 1429 bytes result sent to driver
> 16/11/07 17:11:06 ERROR Utils: Uncaught exception in thread stdout

[jira] [Updated] (SPARK-18406) Race between end-of-task and completion iterator read lock release

2017-05-24 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-18406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-18406:

Fix Version/s: 2.0.3

> Race between end-of-task and completion iterator read lock release
> --
>
> Key: SPARK-18406
> URL: https://issues.apache.org/jira/browse/SPARK-18406
> Project: Spark
>  Issue Type: Bug
>  Components: Block Manager, Spark Core
>Affects Versions: 2.0.0, 2.0.1
>Reporter: Josh Rosen
>Assignee: Jiang Xingbo
> Fix For: 2.0.3, 2.2.0
>
>
> The following log comes from a production streaming job where executors 
> periodically die due to uncaught exceptions during block release:
> {code}
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7921
> 16/11/07 17:11:06 INFO Executor: Running task 0.0 in stage 2390.0 (TID 7921)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7922
> 16/11/07 17:11:06 INFO Executor: Running task 1.0 in stage 2390.0 (TID 7922)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7923
> 16/11/07 17:11:06 INFO Executor: Running task 2.0 in stage 2390.0 (TID 7923)
> 16/11/07 17:11:06 INFO TorrentBroadcast: Started reading broadcast variable 
> 2721
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7924
> 16/11/07 17:11:06 INFO Executor: Running task 3.0 in stage 2390.0 (TID 7924)
> 16/11/07 17:11:06 INFO MemoryStore: Block broadcast_2721_piece0 stored as 
> bytes in memory (estimated size 5.0 KB, free 4.9 GB)
> 16/11/07 17:11:06 INFO TorrentBroadcast: Reading broadcast variable 2721 took 
> 3 ms
> 16/11/07 17:11:06 INFO MemoryStore: Block broadcast_2721 stored as values in 
> memory (estimated size 9.4 KB, free 4.9 GB)
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_1 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_3 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_2 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_4 locally
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 2, boot = -566, init = 
> 567, finish = 1
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 7, boot = -540, init = 
> 541, finish = 6
> 16/11/07 17:11:06 INFO Executor: Finished task 2.0 in stage 2390.0 (TID 
> 7923). 1429 bytes result sent to driver
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 8, boot = -532, init = 
> 533, finish = 7
> 16/11/07 17:11:06 INFO Executor: Finished task 3.0 in stage 2390.0 (TID 
> 7924). 1429 bytes result sent to driver
> 16/11/07 17:11:06 ERROR Executor: Exception in task 0.0 in stage 2390.0 (TID 
> 7921)
> java.lang.AssertionError: assertion failed
>   at scala.Predef$.assert(Predef.scala:165)
>   at 
> org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:84)
>   at 
> org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:66)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:362)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:361)
>   at scala.Option.foreach(Option.scala:236)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:361)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:356)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:356)
>   at 
> org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:646)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7925
> 16/11/07 17:11:06 INFO Executor: Running task 0.1 in stage 2390.0 (TID 7925)
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_1 locally
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 41, boot = -536, init = 
> 576, finish = 1
> 16/11/07 17:11:06 INFO Executor: Finished task 1.0 in stage 2390.0 (TID 
> 7922). 1429 bytes result sent to driver
> 16/11/07 17:11:06 ERROR Utils: Uncaught exception in thread stdout writer for 
> /databricks/python/bin/python
> java.lang.AssertionError: assertion failed: Block rdd_2741_1 is not locked 
> for reading
>   at 

[jira] [Assigned] (SPARK-16944) [MESOS] Improve data locality when launching new executors when dynamic allocation is enabled

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-16944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-16944:


Assignee: (was: Apache Spark)

> [MESOS] Improve data locality when launching new executors when dynamic 
> allocation is enabled
> -
>
> Key: SPARK-16944
> URL: https://issues.apache.org/jira/browse/SPARK-16944
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.0.0
>Reporter: Sun Rui
>
> Currently Spark on Yarn supports better data locality by considering the 
> preferred locations of the pending tasks when dynamic allocation is enabled. 
> Refer to https://issues.apache.org/jira/browse/SPARK-4352. It would be better 
> that Mesos can also support this feature.
> I guess that some logic existing in Yarn could be reused by Mesos.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-16944) [MESOS] Improve data locality when launching new executors when dynamic allocation is enabled

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-16944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-16944:


Assignee: Apache Spark

> [MESOS] Improve data locality when launching new executors when dynamic 
> allocation is enabled
> -
>
> Key: SPARK-16944
> URL: https://issues.apache.org/jira/browse/SPARK-16944
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.0.0
>Reporter: Sun Rui
>Assignee: Apache Spark
>
> Currently Spark on Yarn supports better data locality by considering the 
> preferred locations of the pending tasks when dynamic allocation is enabled. 
> Refer to https://issues.apache.org/jira/browse/SPARK-4352. It would be better 
> that Mesos can also support this feature.
> I guess that some logic existing in Yarn could be reused by Mesos.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16944) [MESOS] Improve data locality when launching new executors when dynamic allocation is enabled

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023729#comment-16023729
 ] 

Apache Spark commented on SPARK-16944:
--

User 'gpang' has created a pull request for this issue:
https://github.com/apache/spark/pull/18098

> [MESOS] Improve data locality when launching new executors when dynamic 
> allocation is enabled
> -
>
> Key: SPARK-16944
> URL: https://issues.apache.org/jira/browse/SPARK-16944
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.0.0
>Reporter: Sun Rui
>
> Currently Spark on Yarn supports better data locality by considering the 
> preferred locations of the pending tasks when dynamic allocation is enabled. 
> Refer to https://issues.apache.org/jira/browse/SPARK-4352. It would be better 
> that Mesos can also support this feature.
> I guess that some logic existing in Yarn could be reused by Mesos.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20872) ShuffleExchange.nodeName should handle null coordinator

2017-05-24 Thread Kris Mok (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kris Mok updated SPARK-20872:
-
Description: 
A ShuffleExchange's coordinator can sometimes be null, and when we need to call 
toString() on it, ShuffleExchange.nodeName() throws a MatchError because of a 
non-exhaustive match -- the match only handles Some and None, but not null.

An example of this issue is trying to inspect a Catalyst physical operator on 
the Executor side in an IDE:
{code:none}
child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
exception. Cannot evaluate 
org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
{code}
where this WholeStageCodegenExec transitively references a ShuffleExchange.
On the Executor side, this ShuffleExchange instance is deserialized from the 
data sent over from the Driver, and because the coordinator field is marked 
transient, it is not carried over to the Executor, which is why it can be null 
upon inspection.

  was:
A ShuffleExchange's coordinator can be null sometimes, and when we need to do a 
toString() on it, it'll go to ShuffleExchange.nodeName() and throw a MatchError 
there because of inexhaustive match -- the match only handles Some and None, 
but doesn't handle null.

An example of this issue is when trying to inspect a Catalyst physical operator 
in an IDE:
{code:none}
child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
exception. Cannot evaluate 
org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
{code}
where this WholeStageCodegenExec transitively references a ShuffleExchange.


> ShuffleExchange.nodeName should handle null coordinator
> ---
>
> Key: SPARK-20872
> URL: https://issues.apache.org/jira/browse/SPARK-20872
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kris Mok
>Priority: Trivial
>  Labels: easyfix
>
> A ShuffleExchange's coordinator can be null sometimes, and when we need to do 
> a toString() on it, it'll go to ShuffleExchange.nodeName() and throw a 
> MatchError there because of inexhaustive match -- the match only handles Some 
> and None, but doesn't handle null.
> An example of this issue is when trying to inspect a Catalyst physical 
> operator on the Executor side in an IDE:
> {code:none}
> child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
> exception. Cannot evaluate 
> org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
> {code}
> where this WholeStageCodegenExec transitively references a ShuffleExchange.
> On the Executor side, this ShuffleExchange instance is deserialized from the 
> data sent over from the Driver, and because the coordinator field is marked 
> transient, it's not carried over to the Executor, that's why it can be null 
> upon inspection.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20873) Improve the error message for unsupported Column Type

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023605#comment-16023605
 ] 

Apache Spark commented on SPARK-20873:
--

User 'setjet' has created a pull request for this issue:
https://github.com/apache/spark/pull/18097

> Improve the error message for unsupported Column Type
> -
>
> Key: SPARK-20873
> URL: https://issues.apache.org/jira/browse/SPARK-20873
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Ruben Janssen
>Assignee: Xiao Li
>
> For unsupported column type, we simply output the column type instead of the 
> type name. 
> {noformat}
> java.lang.Exception: Unsupported type: 
> org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
> {noformat}
> We should improve it by outputting its name.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20873) Improve the error message for unsupported Column Type

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20873:


Assignee: Apache Spark  (was: Xiao Li)

> Improve the error message for unsupported Column Type
> -
>
> Key: SPARK-20873
> URL: https://issues.apache.org/jira/browse/SPARK-20873
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Ruben Janssen
>Assignee: Apache Spark
>
> For unsupported column type, we simply output the column type instead of the 
> type name. 
> {noformat}
> java.lang.Exception: Unsupported type: 
> org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
> {noformat}
> We should improve it by outputting its name.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20873) Improve the error message for unsupported Column Type

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20873:


Assignee: Xiao Li  (was: Apache Spark)

> Improve the error message for unsupported Column Type
> -
>
> Key: SPARK-20873
> URL: https://issues.apache.org/jira/browse/SPARK-20873
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Ruben Janssen
>Assignee: Xiao Li
>
> For unsupported column type, we simply output the column type instead of the 
> type name. 
> {noformat}
> java.lang.Exception: Unsupported type: 
> org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
> {noformat}
> We should improve it by outputting its name.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20873) Improve the error message for unsupported Column Type

2017-05-24 Thread Ruben Janssen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Janssen updated SPARK-20873:
--
Description: 
For unsupported column type, we simply output the column type instead of the 
type name. 

{noformat}
java.lang.Exception: Unsupported type: 
org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
{noformat}

We should improve it by outputting its name.


  was:
For unsupported column type, we simply output the column type instead of the 
type name. 

{noformat}
java.lang.Exception: 
org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
{noformat}

We should improve it by outputting its name.



> Improve the error message for unsupported Column Type
> -
>
> Key: SPARK-20873
> URL: https://issues.apache.org/jira/browse/SPARK-20873
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Ruben Janssen
>Assignee: Xiao Li
>
> For unsupported column type, we simply output the column type instead of the 
> type name. 
> {noformat}
> java.lang.Exception: Unsupported type: 
> org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
> {noformat}
> We should improve it by outputting its name.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20873) Improve the error message for unsupported Column Type

2017-05-24 Thread Ruben Janssen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Janssen updated SPARK-20873:
--
Description: 
For unsupported column type, we simply output the column type instead of the 
type name. 

{noformat}
java.lang.Exception: 
org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
{noformat}

We should improve it by outputting its name.


  was:
For unsupported column type, we simply output the column type instead of the 
type name. 

{noformat}
java.lang.SQLException: 
org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
{noformat}

We should improve it by outputting its name.



> Improve the error message for unsupported Column Type
> -
>
> Key: SPARK-20873
> URL: https://issues.apache.org/jira/browse/SPARK-20873
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Ruben Janssen
>Assignee: Xiao Li
>
> For unsupported column type, we simply output the column type instead of the 
> type name. 
> {noformat}
> java.lang.Exception: 
> org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
> {noformat}
> We should improve it by outputting its name.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20873) Improve the error message for unsupported Column Type

2017-05-24 Thread Ruben Janssen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Janssen updated SPARK-20873:
--
Description: 
For unsupported column type, we simply output the column type instead of the 
type name. 

{noformat}
java.lang.SQLException: 
org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
{noformat}

We should improve it by outputting its name.


  was:
For unsupported data types, we simply output the type number instead of the 
type name. 

{noformat}
java.sql.SQLException: Unsupported type 2014
{noformat}

We should improve it by outputting its name.



> Improve the error message for unsupported Column Type
> -
>
> Key: SPARK-20873
> URL: https://issues.apache.org/jira/browse/SPARK-20873
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Ruben Janssen
>Assignee: Xiao Li
>
> For unsupported column type, we simply output the column type instead of the 
> type name. 
> {noformat}
> java.lang.SQLException: 
> org.apache.spark.sql.execution.columnar.ColumnTypeSuite$$anonfun$4$$anon$1@2205a05d
> {noformat}
> We should improve it by outputting its name.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-20873) Improve the error message for unsupported Column Type

2017-05-24 Thread Ruben Janssen (JIRA)
Ruben Janssen created SPARK-20873:
-

 Summary: Improve the error message for unsupported Column Type
 Key: SPARK-20873
 URL: https://issues.apache.org/jira/browse/SPARK-20873
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.0
Reporter: Ruben Janssen
Assignee: Xiao Li


For unsupported data types, we simply output the type number instead of the 
type name. 

{noformat}
java.sql.SQLException: Unsupported type 2014
{noformat}

We should improve it by outputting its name.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18406) Race between end-of-task and completion iterator read lock release

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023462#comment-16023462
 ] 

Apache Spark commented on SPARK-18406:
--

User 'jiangxb1987' has created a pull request for this issue:
https://github.com/apache/spark/pull/18096

> Race between end-of-task and completion iterator read lock release
> --
>
> Key: SPARK-18406
> URL: https://issues.apache.org/jira/browse/SPARK-18406
> Project: Spark
>  Issue Type: Bug
>  Components: Block Manager, Spark Core
>Affects Versions: 2.0.0, 2.0.1
>Reporter: Josh Rosen
>Assignee: Jiang Xingbo
> Fix For: 2.2.0
>
>
> The following log comes from a production streaming job where executors 
> periodically die due to uncaught exceptions during block release:
> {code}
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7921
> 16/11/07 17:11:06 INFO Executor: Running task 0.0 in stage 2390.0 (TID 7921)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7922
> 16/11/07 17:11:06 INFO Executor: Running task 1.0 in stage 2390.0 (TID 7922)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7923
> 16/11/07 17:11:06 INFO Executor: Running task 2.0 in stage 2390.0 (TID 7923)
> 16/11/07 17:11:06 INFO TorrentBroadcast: Started reading broadcast variable 
> 2721
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7924
> 16/11/07 17:11:06 INFO Executor: Running task 3.0 in stage 2390.0 (TID 7924)
> 16/11/07 17:11:06 INFO MemoryStore: Block broadcast_2721_piece0 stored as 
> bytes in memory (estimated size 5.0 KB, free 4.9 GB)
> 16/11/07 17:11:06 INFO TorrentBroadcast: Reading broadcast variable 2721 took 
> 3 ms
> 16/11/07 17:11:06 INFO MemoryStore: Block broadcast_2721 stored as values in 
> memory (estimated size 9.4 KB, free 4.9 GB)
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_1 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_3 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_2 locally
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_4 locally
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 2, boot = -566, init = 
> 567, finish = 1
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 7, boot = -540, init = 
> 541, finish = 6
> 16/11/07 17:11:06 INFO Executor: Finished task 2.0 in stage 2390.0 (TID 
> 7923). 1429 bytes result sent to driver
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 8, boot = -532, init = 
> 533, finish = 7
> 16/11/07 17:11:06 INFO Executor: Finished task 3.0 in stage 2390.0 (TID 
> 7924). 1429 bytes result sent to driver
> 16/11/07 17:11:06 ERROR Executor: Exception in task 0.0 in stage 2390.0 (TID 
> 7921)
> java.lang.AssertionError: assertion failed
>   at scala.Predef$.assert(Predef.scala:165)
>   at 
> org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:84)
>   at 
> org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:66)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:362)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2$$anonfun$apply$2.apply(BlockInfoManager.scala:361)
>   at scala.Option.foreach(Option.scala:236)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:361)
>   at 
> org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$2.apply(BlockInfoManager.scala:356)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:356)
>   at 
> org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:646)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:281)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 16/11/07 17:11:06 INFO CoarseGrainedExecutorBackend: Got assigned task 7925
> 16/11/07 17:11:06 INFO Executor: Running task 0.1 in stage 2390.0 (TID 7925)
> 16/11/07 17:11:06 INFO BlockManager: Found block rdd_2741_1 locally
> 16/11/07 17:11:06 INFO PythonRunner: Times: total = 41, boot = -536, init = 
> 576, finish = 1
> 16/11/07 17:11:06 INFO Executor: Finished task 1.0 in stage 2390.0 (TID 
> 7922). 1429 bytes result sent to driver
> 16/11/07 17:11:06 ERROR Utils: Uncaught exception in thread stdout writer

[jira] [Closed] (SPARK-6000) Batch K-Means clusters should support "mini-batch" updates

2017-05-24 Thread Nick Pentreath (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Pentreath closed SPARK-6000.
-
Resolution: Duplicate

> Batch K-Means clusters should support "mini-batch" updates
> --
>
> Key: SPARK-6000
> URL: https://issues.apache.org/jira/browse/SPARK-6000
> Project: Spark
>  Issue Type: Improvement
>  Components: MLlib
>Affects Versions: 1.2.1
>Reporter: Derrick Burns
>Priority: Minor
>
> One of the ways of improving the performance of the K-means clustering 
> algorithm is to sample the points on each round of the Lloyd's algorithm and 
> to only use those samples to update the cluster centers.  (Note that this is 
> similar to the update algorithm of streaming K-means.)  The Spark K-Means 
> clusterer should support the mini-batch algorithm for large data sets. 
> The K-Means implementation at 
> https://github.com/derrickburns/generalized-kmeans-clustering supports the 
> mini-batch algorithm.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-14174) Accelerate KMeans via Mini-Batch EM

2017-05-24 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023454#comment-16023454
 ] 

Nick Pentreath commented on SPARK-14174:


It makes sense. However, I think k=100 is perhaps less common than, say, the 
range k=5 to k=10, or k=20. Of course it depends on the dimensionality and the 
problem domain. Would it be possible to update the performance numbers with 
k=5, k=10, and k=20?

> Accelerate KMeans via Mini-Batch EM
> ---
>
> Key: SPARK-14174
> URL: https://issues.apache.org/jira/browse/SPARK-14174
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Reporter: zhengruifeng
> Attachments: MiniBatchKMeans_Performance.pdf
>
>
> The MiniBatchKMeans is a variant of the KMeans algorithm which uses 
> mini-batches to reduce the computation time, while still attempting to 
> optimise the same objective function. Mini-batches are subsets of the input 
> data, randomly sampled in each training iteration. These mini-batches 
> drastically reduce the amount of computation required to converge to a local 
> solution. In contrast to other algorithms that reduce the convergence time of 
> k-means, mini-batch k-means produces results that are generally only slightly 
> worse than the standard algorithm.
> I have implemented mini-batch k-means in MLlib, and the acceleration is really 
> significant.
> The MiniBatch KMeans is named XMeans in the following lines.
> {code}
> val path = "/tmp/mnist8m.scale"
> val data = MLUtils.loadLibSVMFile(sc, path)
> val vecs = data.map(_.features).persist()
> val km = KMeans.train(data=vecs, k=10, maxIterations=10, runs=1, 
> initializationMode="k-means||", seed=123l)
> km.computeCost(vecs)
> res0: Double = 3.317029898599564E8
> val xm = XMeans.train(data=vecs, k=10, maxIterations=10, runs=1, 
> initializationMode="k-means||", miniBatchFraction=0.1, seed=123l)
> xm.computeCost(vecs)
> res1: Double = 3.3169865959604424E8
> val xm2 = XMeans.train(data=vecs, k=10, maxIterations=10, runs=1, 
> initializationMode="k-means||", miniBatchFraction=0.01, seed=123l)
> xm2.computeCost(vecs)
> res2: Double = 3.317195831216454E8
> {code}
> All three of the above trainings reached the maximum number of iterations (10).
> We can see that the WSSSEs are almost the same, while their training times 
> differ significantly:
> {code}
> KMeans                              2876 sec
> MiniBatch KMeans (fraction=0.1)      263 sec
> MiniBatch KMeans (fraction=0.01)      90 sec
> {code}
> With an appropriate fraction, the bigger the dataset is, the higher the speedup.
> The data used above has 8,100,000 samples and 784 features. It can be 
> downloaded here 
> (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/mnist8m.scale.bz2)
> Comparison of the K-Means and MiniBatchKMeans on sklearn : 
> http://scikit-learn.org/stable/auto_examples/cluster/plot_mini_batch_kmeans.html#example-cluster-plot-mini-batch-kmeans-py
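A rough, self-contained sketch of the mini-batch update idea described above (plain Scala collections stand in for the RDD; this is illustrative and not the proposed XMeans/MLlib code -- the real algorithm also uses per-center learning rates rather than a plain batch mean):

{code}
import scala.util.Random

type Point = Array[Double]

def dist2(a: Point, b: Point): Double =
  a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

def closest(centers: Array[Point], p: Point): Int =
  centers.indices.minBy(i => dist2(centers(i), p))

// One Lloyd's iteration computed on a random mini-batch instead of the full data set.
def miniBatchStep(data: Seq[Point], centers: Array[Point],
                  fraction: Double, rng: Random): Array[Point] = {
  val batch = data.filter(_ => rng.nextDouble() < fraction)   // sample the mini-batch
  val byCenter = batch.groupBy(p => closest(centers, p))      // assign to nearest center
  centers.indices.map { i =>
    byCenter.get(i) match {
      case Some(pts) => pts.transpose.map(_.sum / pts.size).toArray  // move to batch mean
      case None      => centers(i)                                   // nothing sampled: keep it
    }
  }.toArray
}
{code}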



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20872) ShuffleExchange.nodeName should handle null coordinator

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20872:


Assignee: Apache Spark

> ShuffleExchange.nodeName should handle null coordinator
> ---
>
> Key: SPARK-20872
> URL: https://issues.apache.org/jira/browse/SPARK-20872
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kris Mok
>Assignee: Apache Spark
>Priority: Trivial
>  Labels: easyfix
>
> A ShuffleExchange's coordinator can be null sometimes, and when we need to do 
> a toString() on it, it'll go to ShuffleExchange.nodeName() and throw a 
> MatchError there because of inexhaustive match -- the match only handles Some 
> and None, but doesn't handle null.
> An example of this issue is when trying to inspect a Catalyst physical 
> operator in an IDE:
> {code:none}
> child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
> exception. Cannot evaluate 
> org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
> {code}
> where this WholeStageCodegenExec transitively references a ShuffleExchange.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20872) ShuffleExchange.nodeName should handle null coordinator

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20872:


Assignee: (was: Apache Spark)

> ShuffleExchange.nodeName should handle null coordinator
> ---
>
> Key: SPARK-20872
> URL: https://issues.apache.org/jira/browse/SPARK-20872
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kris Mok
>Priority: Trivial
>  Labels: easyfix
>
> A ShuffleExchange's coordinator can be null sometimes, and when we need to do 
> a toString() on it, it'll go to ShuffleExchange.nodeName() and throw a 
> MatchError there because of inexhaustive match -- the match only handles Some 
> and None, but doesn't handle null.
> An example of this issue is when trying to inspect a Catalyst physical 
> operator in an IDE:
> {code:none}
> child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
> exception. Cannot evaluate 
> org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
> {code}
> where this WholeStageCodegenExec transitively references a ShuffleExchange.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20872) ShuffleExchange.nodeName should handle null coordinator

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023440#comment-16023440
 ] 

Apache Spark commented on SPARK-20872:
--

User 'rednaxelafx' has created a pull request for this issue:
https://github.com/apache/spark/pull/18095

> ShuffleExchange.nodeName should handle null coordinator
> ---
>
> Key: SPARK-20872
> URL: https://issues.apache.org/jira/browse/SPARK-20872
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kris Mok
>Priority: Trivial
>  Labels: easyfix
>
> A ShuffleExchange's coordinator can be null sometimes, and when we need to do 
> a toString() on it, it'll go to ShuffleExchange.nodeName() and throw a 
> MatchError there because of inexhaustive match -- the match only handles Some 
> and None, but doesn't handle null.
> An example of this issue is when trying to inspect a Catalyst physical 
> operator in an IDE:
> {code:none}
> child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
> exception. Cannot evaluate 
> org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
> {code}
> where this WholeStageCodegenExec transitively references a ShuffleExchange.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20866) Dataset map does not respect nullable field

2017-05-24 Thread Kazuaki Ishizaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023435#comment-16023435
 ] 

Kazuaki Ishizaki commented on SPARK-20866:
--

I confirmed that this has been fixed in the master branch. Am I correct?
If so, we need to find the JIRA entry for that fix.

> Dataset map does not respect nullable field 
> 
>
> Key: SPARK-20866
> URL: https://issues.apache.org/jira/browse/SPARK-20866
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: Colin Breame
>
> The Dataset.map does not respect the nullable fields within the schema. 
> *Test code:*
> (run on spark-shell 2.1.0):
> {code}
> scala> case class Test(a: Int)
> defined class Test
> scala> val ds1 = (Test(10) :: Nil).toDS
> ds1: org.apache.spark.sql.Dataset[Test] = [a: int]
> scala> val ds2 = ds1.map(x => Test(x.a))
> ds2: org.apache.spark.sql.Dataset[Test] = [a: int]
> scala> ds1.schema == ds2.schema
> res65: Boolean = false
> scala> ds1.schema
> res62: org.apache.spark.sql.types.StructType = 
> StructType(StructField(a,IntegerType,false))
> scala> ds2.schema
> res63: org.apache.spark.sql.types.StructType = 
> StructType(StructField(a,IntegerType,true))
> {code}
> *Expected*
> The ds1 should equal ds2. i.e. the schema should be the same.
> *Actual*
> The schema is not equal - the StructField nullable property is true in ds2 
> and false in ds1.
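If this still bites on an affected version, one possible workaround is to re-apply the original schema after the map. This is an illustrative sketch only, assuming the same spark session, ds1, ds2 and Test case class as in the report:

{code}
// Rebuild ds2 with ds1's schema so that nullable=false is preserved for column a.
import spark.implicits._
val ds2Fixed = spark.createDataFrame(ds2.toDF.rdd, ds1.schema).as[Test]
ds2Fixed.schema == ds1.schema   // should now be true
{code}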



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20872) ShuffleExchange.nodeName should handle null coordinator

2017-05-24 Thread Kris Mok (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023421#comment-16023421
 ] 

Kris Mok commented on SPARK-20872:
--

The said matching logic in ShuffleExchange.nodeName() is introduced from 
SPARK-9858.

> ShuffleExchange.nodeName should handle null coordinator
> ---
>
> Key: SPARK-20872
> URL: https://issues.apache.org/jira/browse/SPARK-20872
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Kris Mok
>Priority: Trivial
>  Labels: easyfix
>
> A ShuffleExchange's coordinator can be null sometimes, and when we need to do 
> a toString() on it, it'll go to ShuffleExchange.nodeName() and throw a 
> MatchError there because of inexhaustive match -- the match only handles Some 
> and None, but doesn't handle null.
> An example of this issue is when trying to inspect a Catalyst physical 
> operator in an IDE:
> {code:none}
> child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
> exception. Cannot evaluate 
> org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
> {code}
> where this WholeStageCodegenExec transitively references a ShuffleExchange.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-20872) ShuffleExchange.nodeName should handle null coordinator

2017-05-24 Thread Kris Mok (JIRA)
Kris Mok created SPARK-20872:


 Summary: ShuffleExchange.nodeName should handle null coordinator
 Key: SPARK-20872
 URL: https://issues.apache.org/jira/browse/SPARK-20872
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.0
Reporter: Kris Mok
Priority: Trivial


A ShuffleExchange's coordinator can sometimes be null, and when we need to call 
toString() on it, ShuffleExchange.nodeName() throws a MatchError because of a 
non-exhaustive match -- the match only handles Some and None, but not null.

An example of this issue is when trying to inspect a Catalyst physical operator 
in an IDE:
{code:none}
child = {WholeStageCodegenExec@13881} Method threw 'scala.MatchError' 
exception. Cannot evaluate 
org.apache.spark.sql.execution.WholeStageCodegenExec.toString()
{code}
where this WholeStageCodegenExec transitively references a ShuffleExchange.
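A minimal sketch of the kind of null-tolerant match that would avoid the MatchError (illustrative only, not the actual patch; the label text is made up):

{code}
// coordinator is an Option but may be null after deserialization on the Executor,
// because the field is transient; a plain Some/None match throws MatchError on null.
def nodeName(coordinator: Option[AnyRef]): String = coordinator match {
  case Some(c) => s"Exchange(coordinator id: ${c.hashCode()})"
  case _       => "Exchange"   // covers both None and null
}
{code}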



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20775) from_json should also have an API where the schema is specified with a string

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20775:


Assignee: Apache Spark

> from_json should also have an API where the schema is specified with a string
> -
>
> Key: SPARK-20775
> URL: https://issues.apache.org/jira/browse/SPARK-20775
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Burak Yavuz
>Assignee: Apache Spark
>
> Right now you also have to provide a java.util.Map which is not nice for 
> Scala users.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20865) caching dataset throws "Queries with streaming sources must be executed with writeStream.start()"

2017-05-24 Thread Michael Armbrust (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Armbrust updated SPARK-20865:
-
Description: 
{code}
SparkSession
  .builder
  .master("local[*]")
  .config("spark.sql.warehouse.dir", "C:/tmp/spark")
  .config("spark.sql.streaming.checkpointLocation", 
"C:/tmp/spark/spark-checkpoint")
  .appName("my-test")
  .getOrCreate
  .readStream
  .schema(schema)
  .json("src/test/data")
  .cache
  .writeStream
  .start
  .awaitTermination
{code}

While executing this sample in Spark I got an error. Without the .cache option 
it worked as intended, but with the .cache option I got:

{code}
Exception in thread "main" org.apache.spark.sql.AnalysisException: Queries with 
streaming sources must be executed with writeStream.start();; 
FileSource[src/test/data] at 
org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:196)
 at 
org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForBatch$1.apply(UnsupportedOperationChecker.scala:35)
 at 
org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForBatch$1.apply(UnsupportedOperationChecker.scala:33)
 at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:128) 
at 
org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForBatch(UnsupportedOperationChecker.scala:33)
 at 
org.apache.spark.sql.execution.QueryExecution.assertSupported(QueryExecution.scala:58)
 at 
org.apache.spark.sql.execution.QueryExecution.withCachedData$lzycompute(QueryExecution.scala:69)
 at 
org.apache.spark.sql.execution.QueryExecution.withCachedData(QueryExecution.scala:67)
 at 
org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:73)
 at 
org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:73)
 at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:79)
 at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:75)
 at 
org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:84)
 at 
org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:84)
 at 
org.apache.spark.sql.execution.CacheManager$$anonfun$cacheQuery$1.apply(CacheManager.scala:102)
 at 
org.apache.spark.sql.execution.CacheManager.writeLock(CacheManager.scala:65) at 
org.apache.spark.sql.execution.CacheManager.cacheQuery(CacheManager.scala:89) 
at org.apache.spark.sql.Dataset.persist(Dataset.scala:2479) at 
org.apache.spark.sql.Dataset.cache(Dataset.scala:2489) at 
org.me.App$.main(App.scala:23) at org.me.App.main(App.scala)
{code}

  was:
SparkSession
  .builder
  .master("local[*]")
  .config("spark.sql.warehouse.dir", "C:/tmp/spark")
  .config("spark.sql.streaming.checkpointLocation", 
"C:/tmp/spark/spark-checkpoint")
  .appName("my-test")
  .getOrCreate
  .readStream
  .schema(schema)
  .json("src/test/data")
  .cache
  .writeStream
  .start
  .awaitTermination

While executing this sample in spark got error. Without the .cache option it 
worked as intended but with .cache option i got:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Queries with 
streaming sources must be executed with writeStream.start();; 
FileSource[src/test/data] at 
org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:196)
 at 
org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForBatch$1.apply(UnsupportedOperationChecker.scala:35)
 at 
org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForBatch$1.apply(UnsupportedOperationChecker.scala:33)
 at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:128) 
at 
org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForBatch(UnsupportedOperationChecker.scala:33)
 at 
org.apache.spark.sql.execution.QueryExecution.assertSupported(QueryExecution.scala:58)
 at 
org.apache.spark.sql.execution.QueryExecution.withCachedData$lzycompute(QueryExecution.scala:69)
 at 
org.apache.spark.sql.execution.QueryExecution.withCachedData(QueryExecution.scala:67)
 at 
org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:73)
 at 
org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:73)
 at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:79)
 at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:75)
 at 
org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:84)
 at 
org.apache.spark.sql.execution.QueryExecution.executedPlan(Query

[jira] [Assigned] (SPARK-20775) from_json should also have an API where the schema is specified with a string

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20775:


Assignee: (was: Apache Spark)

> from_json should also have an API where the schema is specified with a string
> -
>
> Key: SPARK-20775
> URL: https://issues.apache.org/jira/browse/SPARK-20775
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Burak Yavuz
>
> Right now you also have to provide a java.util.Map which is not nice for 
> Scala users.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20775) from_json should also have an API where the schema is specified with a string

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023402#comment-16023402
 ] 

Apache Spark commented on SPARK-20775:
--

User 'setjet' has created a pull request for this issue:
https://github.com/apache/spark/pull/18094

> from_json should also have an API where the schema is specified with a string
> -
>
> Key: SPARK-20775
> URL: https://issues.apache.org/jira/browse/SPARK-20775
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Burak Yavuz
>
> Right now you also have to provide a java.util.Map which is not nice for 
> Scala users.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4899) Support Mesos features: roles and checkpoints

2017-05-24 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023383#comment-16023383
 ] 

Michael Gummelt commented on SPARK-4899:


Thanks Kamal.  I responded to the thread, which I'll copy here:

bq. Restarting the agent without checkpointing enabled will kill the executor, 
but that still shouldn't cause the Spark job to fail, since Spark jobs should 
tolerate executor failures.

So I'm fine with adding checkpointing support, but I'm not sure it actually 
solves any problem.

> Support Mesos features: roles and checkpoints
> -
>
> Key: SPARK-4899
> URL: https://issues.apache.org/jira/browse/SPARK-4899
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 1.2.0
>Reporter: Andrew Ash
>
> Inspired by https://github.com/apache/spark/pull/60
> Mesos has two features that would be nice for Spark to take advantage of:
> 1. Roles -- a way to specify ACLs and priorities for users
> 2. Checkpoints -- a way to restart a failed Mesos slave without losing all 
> the work that was happening on the box
> Some of these may require a Mesos upgrade past our current 0.18.1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20815) NullPointerException in RPackageUtils#checkManifestForR

2017-05-24 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023343#comment-16023343
 ] 

Felix Cheung commented on SPARK-20815:
--

Thanks Sean

> NullPointerException in RPackageUtils#checkManifestForR
> ---
>
> Key: SPARK-20815
> URL: https://issues.apache.org/jira/browse/SPARK-20815
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.1.1
>Reporter: Andrew Ash
>Assignee: James Shuster
> Fix For: 2.2.0, 2.3.0
>
>
> Some jars don't have manifest files in them, such as (in my case) 
> javax.inject-1.jar and value-2.2.1-annotations.jar.
> This causes the NPE below:
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.spark.deploy.RPackageUtils$.checkManifestForR(RPackageUtils.scala:95)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1$$anonfun$apply$1.apply$mcV$sp(RPackageUtils.scala:180)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1$$anonfun$apply$1.apply(RPackageUtils.scala:180)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1$$anonfun$apply$1.apply(RPackageUtils.scala:180)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1322)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1.apply(RPackageUtils.scala:202)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1.apply(RPackageUtils.scala:175)
> at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> at 
> org.apache.spark.deploy.RPackageUtils$.checkAndBuildRPackage(RPackageUtils.scala:175)
> at 
> org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:311)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:152)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:118)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {noformat}
> due to RPackageUtils#checkManifestForR assuming {{jar.getManifest}} is 
> non-null.
> However per the JDK spec it can be null:
> {noformat}
> /**
>  * Returns the jar file manifest, or null if none.
>  *
>  * @return the jar file manifest, or null if none
>  *
>  * @throws IllegalStateException
>  * may be thrown if the jar file has been closed
>  * @throws IOException  if an I/O error has occurred
>  */
> public Manifest getManifest() throws IOException {
> return getManifestFromReference();
> }
> {noformat}
> This method should do a null check and return false if the manifest is null 
> (meaning no R code in that jar)
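A minimal sketch of the null check the ticket asks for (illustrative only, not the actual patch; the manifest attribute name is assumed):

{code}
import java.util.jar.JarFile

def checkManifestForR(jar: JarFile): Boolean = {
  val manifest = jar.getManifest          // may legitimately be null per the JDK javadoc
  if (manifest == null) {
    false                                 // no manifest means no R code in this jar
  } else {
    val attrs = manifest.getMainAttributes
    "true".equalsIgnoreCase(attrs.getValue("Spark-HasRPackage"))   // assumed attribute name
  }
}
{code}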



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20815) NullPointerException in RPackageUtils#checkManifestForR

2017-05-24 Thread Felix Cheung (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felix Cheung reassigned SPARK-20815:


Assignee: James Shuster

> NullPointerException in RPackageUtils#checkManifestForR
> ---
>
> Key: SPARK-20815
> URL: https://issues.apache.org/jira/browse/SPARK-20815
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.1.1
>Reporter: Andrew Ash
>Assignee: James Shuster
> Fix For: 2.2.0, 2.3.0
>
>
> Some jars don't have manifest files in them, such as in my case 
> javax.inject-1.jar and value-2.2.1-annotations.jar
> This causes the below NPE:
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.spark.deploy.RPackageUtils$.checkManifestForR(RPackageUtils.scala:95)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1$$anonfun$apply$1.apply$mcV$sp(RPackageUtils.scala:180)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1$$anonfun$apply$1.apply(RPackageUtils.scala:180)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1$$anonfun$apply$1.apply(RPackageUtils.scala:180)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1322)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1.apply(RPackageUtils.scala:202)
> at 
> org.apache.spark.deploy.RPackageUtils$$anonfun$checkAndBuildRPackage$1.apply(RPackageUtils.scala:175)
> at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> at 
> org.apache.spark.deploy.RPackageUtils$.checkAndBuildRPackage(RPackageUtils.scala:175)
> at 
> org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:311)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:152)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:118)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {noformat}
> due to RPackageUtils#checkManifestForR assuming {{jar.getManifest}} is 
> non-null.
> However per the JDK spec it can be null:
> {noformat}
> /**
>  * Returns the jar file manifest, or null if none.
>  *
>  * @return the jar file manifest, or null if none
>  *
>  * @throws IllegalStateException
>  * may be thrown if the jar file has been closed
>  * @throws IOException  if an I/O error has occurred
>  */
> public Manifest getManifest() throws IOException {
> return getManifestFromReference();
> }
> {noformat}
> This method should do a null check and return false if the manifest is null 
> (meaning there is no R code in that jar).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20774) BroadcastExchangeExec doesn't cancel the Spark job if broadcasting a relation timeouts.

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20774:


Assignee: (was: Apache Spark)

> BroadcastExchangeExec doesn't cancel the Spark job if broadcasting a relation 
> timeouts.
> ---
>
> Key: SPARK-20774
> URL: https://issues.apache.org/jira/browse/SPARK-20774
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0, 2.1.1
>Reporter: Shixiong Zhu
>
> When broadcasting a table takes too long and triggers timeout, the SQL query 
> will fail. However, the background Spark job is still running and it wastes 
> resources.
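
To illustrate the idea from the driver side (a sketch under assumed names and an 
assumed timeout, not the actual BroadcastExchangeExec fix): tag the job that builds 
the broadcast with a job group and cancel that group when the wait times out, so the 
background job stops instead of wasting resources.

{code}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val groupId = "broadcast-build-demo"   // assumed name, for illustration only
spark.sparkContext.setJobGroup(groupId, "collect relation for broadcast", interruptOnCancel = true)

// `df` stands in for the relation being broadcast.
val collected = Future { df.collect() }
try {
  Await.result(collected, 300.seconds)
} catch {
  case e: java.util.concurrent.TimeoutException =>
    // Without this, the collect job keeps running after the query has already failed.
    spark.sparkContext.cancelJobGroup(groupId)
    throw e
}
{code}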



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20774) BroadcastExchangeExec doesn't cancel the Spark job if broadcasting a relation timeouts.

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20774:


Assignee: Apache Spark

> BroadcastExchangeExec doesn't cancel the Spark job if broadcasting a relation 
> timeouts.
> ---
>
> Key: SPARK-20774
> URL: https://issues.apache.org/jira/browse/SPARK-20774
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0, 2.1.1
>Reporter: Shixiong Zhu
>Assignee: Apache Spark
>
> When broadcasting a table takes too long and triggers timeout, the SQL query 
> will fail. However, the background Spark job is still running and it wastes 
> resources.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20774) BroadcastExchangeExec doesn't cancel the Spark job if broadcasting a relation timeouts.

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023335#comment-16023335
 ] 

Apache Spark commented on SPARK-20774:
--

User 'liyichao' has created a pull request for this issue:
https://github.com/apache/spark/pull/18093

> BroadcastExchangeExec doesn't cancel the Spark job if broadcasting a relation 
> timeouts.
> ---
>
> Key: SPARK-20774
> URL: https://issues.apache.org/jira/browse/SPARK-20774
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0, 2.1.1
>Reporter: Shixiong Zhu
>
> When broadcasting a table takes too long and triggers timeout, the SQL query 
> will fail. However, the background Spark job is still running and it wastes 
> resources.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20660) Not able to merge Dataframes with different column orders

2017-05-24 Thread Michel Lemay (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1602#comment-1602
 ] 

Michel Lemay commented on SPARK-20660:
--

In my opinion, two schemas should be considered the same if the columns are the 
same regardless of their order.

However, throwing an error would be significantly better than doing something 
unexpected.
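
As an illustration of reordering by name before the union (a sketch only; this 
helper is not part of Spark 2.1 and its name is made up):

{code}
import org.apache.spark.sql.DataFrame

// Union two frames by column name rather than by position; assumes both frames
// have exactly the same set of column names.
def unionByColumnName(a: DataFrame, b: DataFrame): DataFrame = {
  require(a.columns.sorted.sameElements(b.columns.sorted),
    "both DataFrames must have the same column names")
  a.union(b.select(a.columns.map(b(_)): _*))
}

// unionByColumnName(a, b).show()   // same result regardless of b's column order
{code}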

> Not able to merge Dataframes with different column orders
> -
>
> Key: SPARK-20660
> URL: https://issues.apache.org/jira/browse/SPARK-20660
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: Michel Lemay
>Priority: Minor
>
> Union on two dataframes with different column orders is not supported and 
> leads to hard-to-find issues.
> Here is an example showing the issue.
> {code}
> import org.apache.spark.sql.types._
> import org.apache.spark.sql.Row
> var inputSchema = StructType(StructField("key", StringType, nullable=true) :: 
> StructField("value", IntegerType, nullable=true) :: Nil)
> var a = spark.createDataFrame(sc.parallelize((1 to 10)).map(x => 
> Row(x.toString, 555)), inputSchema)
> var b = a.select($"value" * 2 alias "value", $"key")  // any transformation 
> changing column order will show the problem.
> a.union(b).show
> // in order to make it work, we need to reorder columns
> val bCols = a.columns.map(aCol => b(aCol))
> a.union(b.select(bCols:_*)).show
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-20871) Only log Janino code in debug mode

2017-05-24 Thread Glen Takahashi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023306#comment-16023306
 ] 

Glen Takahashi edited comment on SPARK-20871 at 5/24/17 5:46 PM:
-

Just attached an example heapdump of Janino logging adding gigabytes worth of 
strings to the heap


was (Author: gtakaha...@palantir.com):
An example of Janino logging adding gigabytes worth of strings to the heap

> Only log Janino code in debug mode
> --
>
> Key: SPARK-20871
> URL: https://issues.apache.org/jira/browse/SPARK-20871
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Glen Takahashi
> Attachments: 6a57e344-3fcf-11e7-85cc-52a06df2a489.png
>
>
> Currently if Janino code compilation fails, it will log the entirety of the 
> code in the executors. Because the generated code can often be very large, 
> the logging can cause heap pressure on the driver and cause it to fall over.
> I propose removing the "$formatted" from here: 
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L964



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20871) Only log Janino code in debug mode

2017-05-24 Thread Glen Takahashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Takahashi updated SPARK-20871:
---
Attachment: 6a57e344-3fcf-11e7-85cc-52a06df2a489.png

An example of Janino logging adding gigabytes worth of strings to the heap

> Only log Janino code in debug mode
> --
>
> Key: SPARK-20871
> URL: https://issues.apache.org/jira/browse/SPARK-20871
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Glen Takahashi
> Attachments: 6a57e344-3fcf-11e7-85cc-52a06df2a489.png
>
>
> Currently if Janino code compilation fails, it will log the entirety of the 
> code in the executors. Because the generated code can often be very large, 
> the logging can cause heap pressure on the driver and cause it to fall over.
> I propose removing the "$formatted" from here: 
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L964



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-20871) Only log Janino code in debug mode

2017-05-24 Thread Glen Takahashi (JIRA)
Glen Takahashi created SPARK-20871:
--

 Summary: Only log Janino code in debug mode
 Key: SPARK-20871
 URL: https://issues.apache.org/jira/browse/SPARK-20871
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.1.1
Reporter: Glen Takahashi


Currently if Janino code compilation fails, it will log the entirety of the 
code in the executors. Because the generated code can often be very large, the 
logging can cause heap pressure on the driver and cause it to fall over.

I propose removing the "$formatted" from here: 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L964
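
The shape of the change is roughly the following (a hedged sketch only, using plain 
slf4j logging; the real code lives in CodeGenerator's compile error handler and 
`formatted` stands for the pretty-printed generated source):

{code}
import org.slf4j.LoggerFactory

object CompileLogging {
  private val log = LoggerFactory.getLogger(getClass)

  // Keep the short message at ERROR and emit the (potentially huge) generated
  // source only when DEBUG logging is enabled.
  def reportCompileFailure(formatted: String, e: Exception): Nothing = {
    log.error(s"failed to compile generated code: $e")  // no source dump on the ERROR path
    if (log.isDebugEnabled) {
      log.debug(s"\n$formatted")                        // full generated code only in debug mode
    }
    throw e
  }
}
{code}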



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20870) Update the output of spark-sql -H

2017-05-24 Thread Xiao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-20870:

Labels: starter  (was: )

> Update the output of spark-sql -H
> -
>
> Key: SPARK-20870
> URL: https://issues.apache.org/jira/browse/SPARK-20870
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Xiao Li
>  Labels: starter
>
> When we input `./bin/spark-sql -H`, the output is still based on Hive. We 
> need to check whether all of the listed options actually work; any option that 
> is not supported should be removed from the list. 
> Also, update the first line to `usage: spark-sql`.
> {noformat}
> usage: hive
> -d,--define   Variable subsitution to apply to hive
>   commands. e.g. -d A=B or --define A=B
> --database  Specify the database to use
> -e  SQL from command line
> -f SQL from files
> -H,--helpPrint help information
> --hiveconfUse value for given property
> --hivevar  Variable subsitution to apply to hive
>   commands. e.g. --hivevar A=B
> -i Initialization SQL file
> -S,--silent  Silent mode in interactive shell
> -v,--verbose Verbose mode (echo executed SQL to the
>   console)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-20870) Update the output of spark-sql -H

2017-05-24 Thread Xiao Li (JIRA)
Xiao Li created SPARK-20870:
---

 Summary: Update the output of spark-sql -H
 Key: SPARK-20870
 URL: https://issues.apache.org/jira/browse/SPARK-20870
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.0
Reporter: Xiao Li


When we input `./bin/spark-sql -H`, the output is still based on Hive. We need 
to check whether all of the listed options actually work; any option that is not 
supported should be removed from the list. 

Also, update the first line to `usage: spark-sql`.

{noformat}
usage: hive
-d,--define   Variable subsitution to apply to hive
  commands. e.g. -d A=B or --define A=B
--database  Specify the database to use
-e  SQL from command line
-f SQL from files
-H,--helpPrint help information
--hiveconfUse value for given property
--hivevar  Variable subsitution to apply to hive
  commands. e.g. --hivevar A=B
-i Initialization SQL file
-S,--silent  Silent mode in interactive shell
-v,--verbose Verbose mode (echo executed SQL to the
  console)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-20660) Not able to merge Dataframes with different column orders

2017-05-24 Thread lyc (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023230#comment-16023230
 ] 

lyc edited comment on SPARK-20660 at 5/24/17 5:02 PM:
--

Do you expect Spark to fail the query? Because the `StructField`s in a schema are 
ordered, the two schemas are different, so it does not seem right for Spark to 
accept the union.


was (Author: lyc):
Do you expect spark to fail the query?

> Not able to merge Dataframes with different column orders
> -
>
> Key: SPARK-20660
> URL: https://issues.apache.org/jira/browse/SPARK-20660
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: Michel Lemay
>Priority: Minor
>
> Union on two dataframes with different column orders is not supported and 
> leads to hard-to-find issues.
> Here is an example showing the issue.
> {code}
> import org.apache.spark.sql.types._
> import org.apache.spark.sql.Row
> var inputSchema = StructType(StructField("key", StringType, nullable=true) :: 
> StructField("value", IntegerType, nullable=true) :: Nil)
> var a = spark.createDataFrame(sc.parallelize((1 to 10)).map(x => 
> Row(x.toString, 555)), inputSchema)
> var b = a.select($"value" * 2 alias "value", $"key")  // any transformation 
> changing column order will show the problem.
> a.union(b).show
> // in order to make it work, we need to reorder columns
> val bCols = a.columns.map(aCol => b(aCol))
> a.union(b.select(bCols:_*)).show
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20660) Not able to merge Dataframes with different column orders

2017-05-24 Thread lyc (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023230#comment-16023230
 ] 

lyc commented on SPARK-20660:
-

Do you expect spark to fail the query?

> Not able to merge Dataframes with different column orders
> -
>
> Key: SPARK-20660
> URL: https://issues.apache.org/jira/browse/SPARK-20660
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: Michel Lemay
>Priority: Minor
>
> Union on two dataframes with different column orders is not supported and 
> leads to hard-to-find issues.
> Here is an example showing the issue.
> {code}
> import org.apache.spark.sql.types._
> import org.apache.spark.sql.Row
> var inputSchema = StructType(StructField("key", StringType, nullable=true) :: 
> StructField("value", IntegerType, nullable=true) :: Nil)
> var a = spark.createDataFrame(sc.parallelize((1 to 10)).map(x => 
> Row(x.toString, 555)), inputSchema)
> var b = a.select($"value" * 2 alias "value", $"key")  // any transformation 
> changing column order will show the problem.
> a.union(b).show
> // in order to make it work, we need to reorder columns
> val bCols = a.columns.map(aCol => b(aCol))
> a.union(b.select(bCols:_*)).show
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20869) Master may clear failed apps when worker down

2017-05-24 Thread lyc (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lyc updated SPARK-20869:

Summary: Master may clear failed apps when worker down  (was: Master should 
clear failed apps when worker down)

> Master may clear failed apps when worker down
> -
>
> Key: SPARK-20869
> URL: https://issues.apache.org/jira/browse/SPARK-20869
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: lyc
>Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> In `Master.removeWorker`, the master clears executor and driver state but does 
> not clear app state. App state is cleared when `UnregisterApplication` is 
> received and in `onDisconnect`; the first happens when the driver shuts down 
> gracefully, the second is called from `netty`'s `channelInActive` (which is 
> invoked when the channel is closed). Neither path handles the case where there 
> is a network partition between the master and a worker.
> Follow the steps in 
> [SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
> [screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
>  when worker1 is partitioned from the master, the app `app-xxx-000` is still 
> shown as running instead of finished, because worker1 is down.
> cc [~CodingCat]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20869) Master should clear failed apps when worker down

2017-05-24 Thread lyc (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lyc updated SPARK-20869:

Remaining Estimate: 2h  (was: 24h)
 Original Estimate: 2h  (was: 24h)

> Master should clear failed apps when worker down
> 
>
> Key: SPARK-20869
> URL: https://issues.apache.org/jira/browse/SPARK-20869
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: lyc
>Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> In `Master.removeWorker`, the master clears executor and driver state but does 
> not clear app state. App state is cleared when `UnregisterApplication` is 
> received and in `onDisconnect`; the first happens when the driver shuts down 
> gracefully, the second is called from `netty`'s `channelInActive` (which is 
> invoked when the channel is closed). Neither path handles the case where there 
> is a network partition between the master and a worker.
> Follow the steps in 
> [SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
> [screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
>  when worker1 is partitioned from the master, the app `app-xxx-000` is still 
> shown as running instead of finished, because worker1 is down.
> cc [~CodingCat]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20869) Master should clear failed apps when worker down

2017-05-24 Thread lyc (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lyc updated SPARK-20869:

Description: 
In `Master.removeWorker`, the master clears executor and driver state but does not 
clear app state. App state is cleared when `UnregisterApplication` is received and 
in `onDisconnect`; the first happens when the driver shuts down gracefully, the 
second is called from `netty`'s `channelInActive` (which is invoked when the 
channel is closed). Neither path handles the case where there is a network 
partition between the master and a worker.

Follow the steps in 
[SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
[screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
 when worker1 is partitioned from the master, the app `app-xxx-000` is still 
shown as running instead of finished, because worker1 is down.

cc [~CodingCat]

  was:
In `Master.removeWorker`, the master clears executor and driver state but does not 
clear app state. App state is cleared when `UnregisterApplication` is received and 
in `onDisconnect`; the first happens when the driver shuts down gracefully, the 
second is called from `netty`'s `channelInActive` (which is invoked when the 
channel is closed). Neither path handles the case where there is a network 
partition between the master and a worker.

Follow the steps in 
[SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
[screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
 when worker1 is partitioned from the master, the app `app-xxx-000` is still 
shown as running instead of finished, because worker1 is down.

cc @CodingCat


> Master should clear failed apps when worker down
> 
>
> Key: SPARK-20869
> URL: https://issues.apache.org/jira/browse/SPARK-20869
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: lyc
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In `Master.removeWorker`, the master clears executor and driver state but does 
> not clear app state. App state is cleared when `UnregisterApplication` is 
> received and in `onDisconnect`; the first happens when the driver shuts down 
> gracefully, the second is called from `netty`'s `channelInActive` (which is 
> invoked when the channel is closed). Neither path handles the case where there 
> is a network partition between the master and a worker.
> Follow the steps in 
> [SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
> [screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
>  when worker1 is partitioned from the master, the app `app-xxx-000` is still 
> shown as running instead of finished, because worker1 is down.
> cc [~CodingCat]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20869) Master should clear failed apps when worker down

2017-05-24 Thread lyc (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lyc updated SPARK-20869:

Description: 
In `Master.removeWorker`, the master clears executor and driver state but does not 
clear app state. App state is cleared when `UnregisterApplication` is received and 
in `onDisconnect`; the first happens when the driver shuts down gracefully, the 
second is called from `netty`'s `channelInActive` (which is invoked when the 
channel is closed). Neither path handles the case where there is a network 
partition between the master and a worker.

Follow the steps in 
[SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
[screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
 when worker1 is partitioned from the master, the app `app-xxx-000` is still 
shown as running instead of finished, because worker1 is down.

cc @CodingCat

  was:
In `Master.removeWorker`, the master clears executor and driver state but does not 
clear app state. App state is cleared when `UnregisterApplication` is received and 
in `onDisconnect`; the first happens when the driver shuts down gracefully, the 
second is called from `netty`'s `channelInActive`. Neither path handles the case 
where there is a network partition between the master and a worker.

Follow the steps in 
[SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
[screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
 when worker1 is partitioned from the master, the app `app-xxx-000` is still 
shown as running instead of finished, because worker1 is down.

cc @CodingCat


> Master should clear failed apps when worker down
> 
>
> Key: SPARK-20869
> URL: https://issues.apache.org/jira/browse/SPARK-20869
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: lyc
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In `Master.removeWorker`, the master clears executor and driver state but does 
> not clear app state. App state is cleared when `UnregisterApplication` is 
> received and in `onDisconnect`; the first happens when the driver shuts down 
> gracefully, the second is called from `netty`'s `channelInActive` (which is 
> invoked when the channel is closed). Neither path handles the case where there 
> is a network partition between the master and a worker.
> Follow the steps in 
> [SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
> [screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
>  when worker1 is partitioned from the master, the app `app-xxx-000` is still 
> shown as running instead of finished, because worker1 is down.
> cc @CodingCat



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-20869) Master should clear failed apps when worker down

2017-05-24 Thread lyc (JIRA)
lyc created SPARK-20869:
---

 Summary: Master should clear failed apps when worker down
 Key: SPARK-20869
 URL: https://issues.apache.org/jira/browse/SPARK-20869
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.3.0
Reporter: lyc
Priority: Minor


In `Master.removeWorker`, the master clears executor and driver state but does not 
clear app state. App state is cleared when `UnregisterApplication` is received and 
in `onDisconnect`; the first happens when the driver shuts down gracefully, the 
second is called from `netty`'s `channelInActive`. Neither path handles the case 
where there is a network partition between the master and a worker.

Follow the steps in 
[SPARK-19900|https://issues.apache.org/jira/browse/SPARK-19900] and see the 
[screenshots|https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png]:
 when worker1 is partitioned from the master, the app `app-xxx-000` is still 
shown as running instead of finished, because worker1 is down.

cc @CodingCat



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-20848) Dangling threads when reading parquet files in local mode

2017-05-24 Thread Wenchen Fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-20848.
-
   Resolution: Fixed
 Assignee: Liang-Chi Hsieh
Fix Version/s: 2.2.0
   2.1.2

> Dangling threads when reading parquet files in local mode
> -
>
> Key: SPARK-20848
> URL: https://issues.apache.org/jira/browse/SPARK-20848
> Project: Spark
>  Issue Type: Bug
>  Components: Input/Output, SQL
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Nick Pritchard
>Assignee: Liang-Chi Hsieh
> Fix For: 2.1.2, 2.2.0
>
> Attachments: Screen Shot 2017-05-22 at 4.13.52 PM.png
>
>
> On each call to {{spark.read.parquet}}, a new ForkJoinPool is created. One of 
> the threads in the pool is kept in the {{WAITING}} state and never stopped, 
> which leads to unbounded growth in the number of threads.
> This behavior is a regression from v2.1.0.
> Reproducible example:
> {code}
> val spark = SparkSession
>   .builder()
>   .appName("test")
>   .master("local")
>   .getOrCreate()
> while(true) {
>   spark.read.parquet("/path/to/file")
>   Thread.sleep(5000)
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20868) UnsafeShuffleWriter should verify the position after FileChannel.transferTo

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20868:


Assignee: Apache Spark  (was: Wenchen Fan)

> UnsafeShuffleWriter should verify the position after FileChannel.transferTo
> ---
>
> Key: SPARK-20868
> URL: https://issues.apache.org/jira/browse/SPARK-20868
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wenchen Fan
>Assignee: Apache Spark
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20868) UnsafeShuffleWriter should verify the position after FileChannel.transferTo

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023181#comment-16023181
 ] 

Apache Spark commented on SPARK-20868:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/18091

> UnsafeShuffleWriter should verify the position after FileChannel.transferTo
> ---
>
> Key: SPARK-20868
> URL: https://issues.apache.org/jira/browse/SPARK-20868
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20868) UnsafeShuffleWriter should verify the position after FileChannel.transferTo

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20868:


Assignee: Wenchen Fan  (was: Apache Spark)

> UnsafeShuffleWriter should verify the position after FileChannel.transferTo
> ---
>
> Key: SPARK-20868
> URL: https://issues.apache.org/jira/browse/SPARK-20868
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-20868) UnsafeShuffleWriter should verify the position after FileChannel.transferTo

2017-05-24 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-20868:
---

 Summary: UnsafeShuffleWriter should verify the position after 
FileChannel.transferTo
 Key: SPARK-20868
 URL: https://issues.apache.org/jira/browse/SPARK-20868
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.1.0, 2.0.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-20866) Dataset map does not respect nullable field

2017-05-24 Thread Colin Breame (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Breame updated SPARK-20866:
-
Description: 
The Dataset.map does not respect the nullable fields within the schema. 

*Test code:*
(run on spark-shell 2.1.0):

{code}
scala> case class Test(a: Int)
defined class Test

scala> val ds1 = (Test(10) :: Nil).toDS
ds1: org.apache.spark.sql.Dataset[Test] = [a: int]

scala> val ds2 = ds1.map(x => Test(x.a))
ds2: org.apache.spark.sql.Dataset[Test] = [a: int]

scala> ds1.schema == ds2.schema
res65: Boolean = false

scala> ds1.schema
res62: org.apache.spark.sql.types.StructType = 
StructType(StructField(a,IntegerType,false))

scala> ds2.schema
res63: org.apache.spark.sql.types.StructType = 
StructType(StructField(a,IntegerType,true))
{code}

*Expected*
The ds1 should equal ds2. i.e. the schema should be the same.

*Actual*
The schema is not equal - the StructField nullable property is true in ds2 and 
false in ds1.


  was:
The Dataset.map does not respect the nullable fields within the schema. 

*Test code:*
(run on spark-shell 2.1.0):

{code}
scala> val ds1 = (Test(10) :: Nil).toDS
ds1: org.apache.spark.sql.Dataset[Test] = [a: int]

scala> val ds2 = ds1.map(x => Test(x.a))
ds2: org.apache.spark.sql.Dataset[Test] = [a: int]

scala> ds1.schema == ds2.schema
res65: Boolean = false

scala> ds1.schema
res62: org.apache.spark.sql.types.StructType = 
StructType(StructField(a,IntegerType,false))

scala> ds2.schema
res63: org.apache.spark.sql.types.StructType = 
StructType(StructField(a,IntegerType,true))
{code}

*Expected*
The ds1 should equal ds2. i.e. the schema should be the same.

*Actual*
The schema is not equal - the StructField nullable property is true in ds2 and 
false in ds1.



> Dataset map does not respect nullable field 
> 
>
> Key: SPARK-20866
> URL: https://issues.apache.org/jira/browse/SPARK-20866
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: Colin Breame
>
> The Dataset.map does not respect the nullable fields within the schema. 
> *Test code:*
> (run on spark-shell 2.1.0):
> {code}
> scala> case class Test(a: Int)
> defined class Test
> scala> val ds1 = (Test(10) :: Nil).toDS
> ds1: org.apache.spark.sql.Dataset[Test] = [a: int]
> scala> val ds2 = ds1.map(x => Test(x.a))
> ds2: org.apache.spark.sql.Dataset[Test] = [a: int]
> scala> ds1.schema == ds2.schema
> res65: Boolean = false
> scala> ds1.schema
> res62: org.apache.spark.sql.types.StructType = 
> StructType(StructField(a,IntegerType,false))
> scala> ds2.schema
> res63: org.apache.spark.sql.types.StructType = 
> StructType(StructField(a,IntegerType,true))
> {code}
> *Expected*
> The ds1 should equal ds2. i.e. the schema should be the same.
> *Actual*
> The schema is not equal - the StructField nullable property is true in ds2 
> and false in ds1.
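
One possible workaround sketch, assuming the spark-shell session above with 
`Test`, `ds1` and `ds2` already defined (this re-applies ds1's schema rather than 
fixing Dataset.map itself, and whether the nullable flags survive in every case is 
an assumption to verify):

{code}
// Rebuild ds2 on top of ds1's schema so the nullable flags are preserved.
import spark.implicits._

val ds3 = spark.createDataFrame(ds2.toDF().rdd, ds1.schema).as[Test]
ds3.schema == ds1.schema   // expected to be true under this workaround
{code}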



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20250) Improper OOM error when a task been killed while spilling data

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20250:


Assignee: (was: Apache Spark)

> Improper OOM error when a task been killed while spilling data
> --
>
> Key: SPARK-20250
> URL: https://issues.apache.org/jira/browse/SPARK-20250
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0
>Reporter: Feng Zhu
>
> When a task is calling spill() but receives a kill request from the 
> driver (e.g., for a speculative task), the TaskMemoryManager will throw an OOM 
> exception. 
> The executor then treats it as an uncaught exception, which is handled by 
> SparkUncaughtExceptionHandler, and the executor is consequently shut down. 
> However, this error may lead to the failure of the whole application due to 
> "max number of executor failures (30) reached". 
> In our production environment, we have encountered a lot of such cases. 
> \\
> {noformat}
> 17/04/05 06:41:27 INFO sort.UnsafeExternalSorter: Thread 115 spilling sort 
> data of 928.0 MB to disk (1 time so far)
> 17/04/05 06:41:27 INFO sort.UnsafeSorterSpillWriter: Spill 
> file:/data/usercache/application_1482394966158_87487271/blockmgr-85c25fa8-06b4/32/temp_local_b731
> 17/04/05 06:41:27 INFO sort.UnsafeSorterSpillWriter: Write numRecords:2097152
> 17/04/05 06:41:30 INFO executor.Executor: Executor is trying to kill task 
> 16.0 in stage 3.0 (TID 857)
> 17/04/05 06:41:30 ERROR memory.TaskMemoryManager: error while calling spill() 
> on org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@43a122ed
> java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.updateBytesWritten(DiskBlockObjectWriter.scala:228)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.recordWritten(DiskBlockObjectWriter.scala:207)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillWriter.write(UnsafeSorterSpillWriter.java:139)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:196)
>   at 
> org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:170)
>   at 
> org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
>   at 
> org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.growPointerArrayIfNecessary(UnsafeExternalSorter.java:302)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:346)
> 
> 17/04/05 06:41:30 INFO sort.UnsafeExternalSorter: Thread 115 spilling sort 
> data of 928.0 MB to disk (2  times so far)
> 17/04/05 06:41:30 INFO sort.UnsafeSorterSpillWriter: Spill 
> file:/data/usercache/appcache/application_1482394966158_87487271/blockmgr-573312a3-bd46-4c5c-9293-1021cc34c77
> 17/04/05 06:41:30 INFO sort.UnsafeSorterSpillWriter: Write numRecords:2097152
> 17/04/05 06:41:31 ERROR memory.TaskMemoryManager: error while calling spill() 
> on org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@43a122ed
> java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.updateBytesWritten(DiskBlockObjectWriter.scala:228)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.recordWritten(DiskBlockObjectWriter.scala:207)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillWriter.write(UnsafeSorterSpillWriter.java:139)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:196)
>   at 
> org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:170)
>   at 
> org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
>   at 
> org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
> 
> .
> 17/04/05 06:41:31 WARN memory.TaskMemoryManager: leak 32.0 KB memory from 
> org.apache.spark.shuffle.sort.ShuffleExternalSorter@513661a6
> 17/04/05 06:41:31 ERROR executor.Executor: Managed memory le

[jira] [Assigned] (SPARK-20250) Improper OOM error when a task been killed while spilling data

2017-05-24 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20250:


Assignee: Apache Spark

> Improper OOM error when a task been killed while spilling data
> --
>
> Key: SPARK-20250
> URL: https://issues.apache.org/jira/browse/SPARK-20250
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0
>Reporter: Feng Zhu
>Assignee: Apache Spark
>
> When a task is calling spill() but receives a kill request from the 
> driver (e.g., for a speculative task), the TaskMemoryManager will throw an OOM 
> exception. 
> The executor then treats it as an uncaught exception, which is handled by 
> SparkUncaughtExceptionHandler, and the executor is consequently shut down. 
> However, this error may lead to the failure of the whole application due to 
> "max number of executor failures (30) reached". 
> In our production environment, we have encountered a lot of such cases. 
> \\
> {noformat}
> 17/04/05 06:41:27 INFO sort.UnsafeExternalSorter: Thread 115 spilling sort 
> data of 928.0 MB to disk (1 time so far)
> 17/04/05 06:41:27 INFO sort.UnsafeSorterSpillWriter: Spill 
> file:/data/usercache/application_1482394966158_87487271/blockmgr-85c25fa8-06b4/32/temp_local_b731
> 17/04/05 06:41:27 INFO sort.UnsafeSorterSpillWriter: Write numRecords:2097152
> 17/04/05 06:41:30 INFO executor.Executor: Executor is trying to kill task 
> 16.0 in stage 3.0 (TID 857)
> 17/04/05 06:41:30 ERROR memory.TaskMemoryManager: error while calling spill() 
> on org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@43a122ed
> java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.updateBytesWritten(DiskBlockObjectWriter.scala:228)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.recordWritten(DiskBlockObjectWriter.scala:207)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillWriter.write(UnsafeSorterSpillWriter.java:139)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:196)
>   at 
> org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:170)
>   at 
> org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
>   at 
> org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.growPointerArrayIfNecessary(UnsafeExternalSorter.java:302)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:346)
> 
> 17/04/05 06:41:30 INFO sort.UnsafeExternalSorter: Thread 115 spilling sort 
> data of 928.0 MB to disk (2  times so far)
> 17/04/05 06:41:30 INFO sort.UnsafeSorterSpillWriter: Spill 
> file:/data/usercache/appcache/application_1482394966158_87487271/blockmgr-573312a3-bd46-4c5c-9293-1021cc34c77
> 17/04/05 06:41:30 INFO sort.UnsafeSorterSpillWriter: Write numRecords:2097152
> 17/04/05 06:41:31 ERROR memory.TaskMemoryManager: error while calling spill() 
> on org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@43a122ed
> java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.updateBytesWritten(DiskBlockObjectWriter.scala:228)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.recordWritten(DiskBlockObjectWriter.scala:207)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillWriter.write(UnsafeSorterSpillWriter.java:139)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:196)
>   at 
> org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:170)
>   at 
> org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
>   at 
> org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
> 
> .
> 17/04/05 06:41:31 WARN memory.TaskMemoryManager: leak 32.0 KB memory from 
> org.apache.spark.shuffle.sort.ShuffleExternalSorter@513661a6
> 17/04/05 06:41:31 ERROR executor.Ex

[jira] [Commented] (SPARK-20250) Improper OOM error when a task been killed while spilling data

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023084#comment-16023084
 ] 

Apache Spark commented on SPARK-20250:
--

User 'ConeyLiu' has created a pull request for this issue:
https://github.com/apache/spark/pull/18090

> Improper OOM error when a task been killed while spilling data
> --
>
> Key: SPARK-20250
> URL: https://issues.apache.org/jira/browse/SPARK-20250
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0
>Reporter: Feng Zhu
>
> When a task is calling spill() but receives a kill request from the 
> driver (e.g., for a speculative task), the TaskMemoryManager will throw an OOM 
> exception. 
> The executor then treats it as an uncaught exception, which is handled by 
> SparkUncaughtExceptionHandler, and the executor is consequently shut down. 
> However, this error may lead to the failure of the whole application due to 
> "max number of executor failures (30) reached". 
> In our production environment, we have encountered a lot of such cases. 
> \\
> {noformat}
> 17/04/05 06:41:27 INFO sort.UnsafeExternalSorter: Thread 115 spilling sort 
> data of 928.0 MB to disk (1 time so far)
> 17/04/05 06:41:27 INFO sort.UnsafeSorterSpillWriter: Spill 
> file:/data/usercache/application_1482394966158_87487271/blockmgr-85c25fa8-06b4/32/temp_local_b731
> 17/04/05 06:41:27 INFO sort.UnsafeSorterSpillWriter: Write numRecords:2097152
> 17/04/05 06:41:30 INFO executor.Executor: Executor is trying to kill task 
> 16.0 in stage 3.0 (TID 857)
> 17/04/05 06:41:30 ERROR memory.TaskMemoryManager: error while calling spill() 
> on org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@43a122ed
> java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.updateBytesWritten(DiskBlockObjectWriter.scala:228)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.recordWritten(DiskBlockObjectWriter.scala:207)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillWriter.write(UnsafeSorterSpillWriter.java:139)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:196)
>   at 
> org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:170)
>   at 
> org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
>   at 
> org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.growPointerArrayIfNecessary(UnsafeExternalSorter.java:302)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:346)
> 
> 17/04/05 06:41:30 INFO sort.UnsafeExternalSorter: Thread 115 spilling sort 
> data of 928.0 MB to disk (2  times so far)
> 17/04/05 06:41:30 INFO sort.UnsafeSorterSpillWriter: Spill 
> file:/data/usercache/appcache/application_1482394966158_87487271/blockmgr-573312a3-bd46-4c5c-9293-1021cc34c77
> 17/04/05 06:41:30 INFO sort.UnsafeSorterSpillWriter: Write numRecords:2097152
> 17/04/05 06:41:31 ERROR memory.TaskMemoryManager: error while calling spill() 
> on org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@43a122ed
> java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.updateBytesWritten(DiskBlockObjectWriter.scala:228)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.recordWritten(DiskBlockObjectWriter.scala:207)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillWriter.write(UnsafeSorterSpillWriter.java:139)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:196)
>   at 
> org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:170)
>   at 
> org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
>   at 
> org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
> 
> .
> 17/04/05 06:41:31 WARN memory.TaskMemoryManager: leak 32.0 KB memory from 
> org.apac

[jira] [Comment Edited] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2017-05-24 Thread Wenchen Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023072#comment-16023072
 ] 

Wenchen Fan edited comment on SPARK-18105 at 5/24/17 3:13 PM:
--

Can you try to set `spark.file.transferTo` to false and try again? After a 
closer look, I can't find any problems in `LZ4BlockInputStream`, so I looked 
back, and it seems there can be a problem when we merge a lot of large shuffle 
files into one using `transferTo`.


was (Author: cloud_fan):
Can you try to set {spark.file.transferTo} to false and try again? After a 
closer look, I can't find any problems in {LZ4BlockInputStream}, so I looked 
back, and it seems there can be a problem when we merge a lot of large shuffle 
files into one using {transferTo}.

> LZ4 failed to decompress a stream of shuffled data
> --
>
> Key: SPARK-18105
> URL: https://issues.apache.org/jira/browse/SPARK-18105
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Reporter: Davies Liu
>Assignee: Davies Liu
>
> When lz4 is used to compress the shuffle files, decompression may fail with a 
> "Stream is corrupted" error
> {code}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 92 in stage 5.0 failed 4 times, most recent failure: Lost task 92.3 in 
> stage 5.0 (TID 16616, 10.0.27.18): java.io.IOException: Stream is corrupted
>   at 
> org.apache.spark.io.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:220)
>   at 
> org.apache.spark.io.LZ4BlockInputStream.available(LZ4BlockInputStream.java:109)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:353)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
>   at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:127)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:110)
>   at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
>   at 
> org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>   at 
> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:397)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> https://github.com/jpountz/lz4-java/issues/89



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2017-05-24 Thread Wenchen Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023072#comment-16023072
 ] 

Wenchen Fan edited comment on SPARK-18105 at 5/24/17 3:13 PM:
--

Can you try to set {spark.file.transferTo} to false and try again? After a 
closer look, I can't find any problems in {LZ4BlockInputStream}, so I looked 
back, and it seems there can be a problem when we merge a lot of large shuffle 
files into one using {transferTo}.


was (Author: cloud_fan):
Can you try to set {spark.file.transferTo} to false and try again? After a 
closer look, I can't find any problems in {{{LZ4BlockInputStream}}}, so I looked 
back, and it seems there can be a problem when we merge a lot of large shuffle 
files into one using {{{transferTo}}}.

> LZ4 failed to decompress a stream of shuffled data
> --
>
> Key: SPARK-18105
> URL: https://issues.apache.org/jira/browse/SPARK-18105
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Reporter: Davies Liu
>Assignee: Davies Liu
>
> When lz4 is used to compress the shuffle files, decompression may fail with a 
> "Stream is corrupted" error
> {code}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 92 in stage 5.0 failed 4 times, most recent failure: Lost task 92.3 in 
> stage 5.0 (TID 16616, 10.0.27.18): java.io.IOException: Stream is corrupted
>   at 
> org.apache.spark.io.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:220)
>   at 
> org.apache.spark.io.LZ4BlockInputStream.available(LZ4BlockInputStream.java:109)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:353)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
>   at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:127)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:110)
>   at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
>   at 
> org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>   at 
> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:397)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> https://github.com/jpountz/lz4-java/issues/89



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2017-05-24 Thread Wenchen Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023072#comment-16023072
 ] 

Wenchen Fan commented on SPARK-18105:
-

Can you try setting {{spark.file.transferTo}} to false and trying again? After a 
closer look, I can't find any problems in {{LZ4BlockInputStream}}, so I looked 
back, and it seems there can be a problem when we merge a lot of large shuffle 
files into one using {{transferTo}}.

> LZ4 failed to decompress a stream of shuffled data
> --
>
> Key: SPARK-18105
> URL: https://issues.apache.org/jira/browse/SPARK-18105
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Reporter: Davies Liu
>Assignee: Davies Liu
>
> When lz4 is used to compress the shuffle files, it may fail to decompress it 
> as "stream is corrupt"
> {code}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 92 in stage 5.0 failed 4 times, most recent failure: Lost task 92.3 in 
> stage 5.0 (TID 16616, 10.0.27.18): java.io.IOException: Stream is corrupted
>   at 
> org.apache.spark.io.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:220)
>   at 
> org.apache.spark.io.LZ4BlockInputStream.available(LZ4BlockInputStream.java:109)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:353)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
>   at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:127)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:110)
>   at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
>   at 
> org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>   at 
> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:397)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> https://github.com/jpountz/lz4-java/issues/89



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2017-05-24 Thread Wenchen Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023072#comment-16023072
 ] 

Wenchen Fan edited comment on SPARK-18105 at 5/24/17 3:12 PM:
--

Can you try setting {{spark.file.transferTo}} to false and trying again? After a 
closer look, I can't find any problems in {{LZ4BlockInputStream}}, so I looked 
back, and it seems there can be a problem when we merge a lot of large shuffle 
files into one using {{transferTo}}.


was (Author: cloud_fan):
can you try to set {{spark.file.transferTo}} to false and try again? After a 
closer look, I can't find any problems in {{LZ4BlockInputStream}}, so I look 
back, and seems there can be a problem when we merge a lot of large shuffle 
files into one using {{transferTo}}

> LZ4 failed to decompress a stream of shuffled data
> --
>
> Key: SPARK-18105
> URL: https://issues.apache.org/jira/browse/SPARK-18105
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Reporter: Davies Liu
>Assignee: Davies Liu
>
> When lz4 is used to compress the shuffle files, it may fail to decompress it 
> as "stream is corrupt"
> {code}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 92 in stage 5.0 failed 4 times, most recent failure: Lost task 92.3 in 
> stage 5.0 (TID 16616, 10.0.27.18): java.io.IOException: Stream is corrupted
>   at 
> org.apache.spark.io.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:220)
>   at 
> org.apache.spark.io.LZ4BlockInputStream.available(LZ4BlockInputStream.java:109)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:353)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
>   at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:127)
>   at 
> org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:110)
>   at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
>   at 
> org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>   at 
> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:397)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:86)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> https://github.com/jpountz/lz4-java/issues/89



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19281) spark.ml Python API for FPGrowth

2017-05-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023069#comment-16023069
 ] 

Apache Spark commented on SPARK-19281:
--

User 'yanboliang' has created a pull request for this issue:
https://github.com/apache/spark/pull/18089

> spark.ml Python API for FPGrowth
> 
>
> Key: SPARK-19281
> URL: https://issues.apache.org/jira/browse/SPARK-19281
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, PySpark
>Reporter: Joseph K. Bradley
>Assignee: Maciej Szymkiewicz
> Fix For: 2.2.0
>
>
> See parent issue.  This is for a Python API *after* the Scala API has been 
> designed and implemented.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-20862) LogisticRegressionModel throws TypeError

2017-05-24 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang resolved SPARK-20862.
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   2.1.2
   2.0.3

> LogisticRegressionModel throws TypeError
> 
>
> Key: SPARK-20862
> URL: https://issues.apache.org/jira/browse/SPARK-20862
> Project: Spark
>  Issue Type: Bug
>  Components: MLlib, PySpark
>Affects Versions: 2.1.1
>Reporter: Bago Amirbekian
>Assignee: Bago Amirbekian
>Priority: Minor
> Fix For: 2.0.3, 2.1.2, 2.2.0
>
>
> LogisticRegressionModel throws a TypeError using python3 and numpy 1.12.1:
> **
> File "/Users/bago/repos/spark/python/pyspark/mllib/classification.py", line 
> 155, in __main__.LogisticRegressionModel
> Failed example:
> mcm = LogisticRegressionWithLBFGS.train(data, iterations=10, numClasses=3)
> Exception raised:
> Traceback (most recent call last):
>   File 
> "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/doctest.py",
>  line 1330, in __run
> compileflags, 1), test.globs)
>   File "", line 1, in 
> 
> mcm = LogisticRegressionWithLBFGS.train(data, iterations=10, 
> numClasses=3)
>   File "/Users/bago/repos/spark/python/pyspark/mllib/classification.py", 
> line 398, in train
> return _regression_train_wrapper(train, LogisticRegressionModel, 
> data, initialWeights)
>   File "/Users/bago/repos/spark/python/pyspark/mllib/regression.py", line 
> 216, in _regression_train_wrapper
> return modelClass(weights, intercept, numFeatures, numClasses)
>   File "/Users/bago/repos/spark/python/pyspark/mllib/classification.py", 
> line 176, in __init__
> self._dataWithBiasSize)
> TypeError: 'float' object cannot be interpreted as an integer
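
For context, a minimal numpy-only sketch of this failure mode (illustrative 
only, not the actual pyspark/mllib code; it just reproduces the same Python 3 
true-division issue with numpy 1.12+):

{code}
import numpy as np

coeff = np.zeros(8)
num_classes = 3

data_with_bias_size = coeff.size / (num_classes - 1)       # 4.0 -> float under Python 3 true division
try:
    coeff.reshape(num_classes - 1, data_with_bias_size)    # numpy >= 1.12 rejects float dimensions
except TypeError as e:
    print(e)  # 'float' object cannot be interpreted as an integer

data_with_bias_size = coeff.size // (num_classes - 1)      # 4 -> int; floor division avoids the error
print(coeff.reshape(num_classes - 1, data_with_bias_size).shape)  # (2, 4)
{code}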



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20862) LogisticRegressionModel throws TypeError

2017-05-24 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang reassigned SPARK-20862:
---

Assignee: Bago Amirbekian

> LogisticRegressionModel throws TypeError
> 
>
> Key: SPARK-20862
> URL: https://issues.apache.org/jira/browse/SPARK-20862
> Project: Spark
>  Issue Type: Bug
>  Components: MLlib, PySpark
>Affects Versions: 2.1.1
>Reporter: Bago Amirbekian
>Assignee: Bago Amirbekian
>Priority: Minor
>
> LogisticRegressionModel throws a TypeError using python3 and numpy 1.12.1:
> **
> File "/Users/bago/repos/spark/python/pyspark/mllib/classification.py", line 
> 155, in __main__.LogisticRegressionModel
> Failed example:
> mcm = LogisticRegressionWithLBFGS.train(data, iterations=10, numClasses=3)
> Exception raised:
> Traceback (most recent call last):
>   File 
> "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/doctest.py",
>  line 1330, in __run
> compileflags, 1), test.globs)
>   File "", line 1, in 
> 
> mcm = LogisticRegressionWithLBFGS.train(data, iterations=10, 
> numClasses=3)
>   File "/Users/bago/repos/spark/python/pyspark/mllib/classification.py", 
> line 398, in train
> return _regression_train_wrapper(train, LogisticRegressionModel, 
> data, initialWeights)
>   File "/Users/bago/repos/spark/python/pyspark/mllib/regression.py", line 
> 216, in _regression_train_wrapper
> return modelClass(weights, intercept, numFeatures, numClasses)
>   File "/Users/bago/repos/spark/python/pyspark/mllib/classification.py", 
> line 176, in __init__
> self._dataWithBiasSize)
> TypeError: 'float' object cannot be interpreted as an integer



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-20768) PySpark FPGrowth does not expose numPartitions (expert) param

2017-05-24 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-20768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang reassigned SPARK-20768:
---

Assignee: Yan Facai (颜发才)

> PySpark FPGrowth does not expose numPartitions (expert)  param
> --
>
> Key: SPARK-20768
> URL: https://issues.apache.org/jira/browse/SPARK-20768
> Project: Spark
>  Issue Type: Improvement
>  Components: ML, PySpark
>Affects Versions: 2.2.0
>Reporter: Nick Pentreath
>Assignee: Yan Facai (颜发才)
>Priority: Minor
>
> The PySpark API for {{FPGrowth}} does not expose the {{numPartitions}} param. 
> While it is an "expert" param, the general approach elsewhere is to expose 
> these on the Python side (e.g. {{aggregationDepth}} and intermediate storage 
> params in {{ALS}})
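
For illustration, a hedged sketch of how such an expert param is typically 
surfaced on the Python side (the mixin name and doc string below are 
hypothetical; the eventual patch may look different):

{code}
from pyspark.ml.param import Param, Params, TypeConverters


class HasNumPartitions(Params):
    """Hypothetical mixin sketching the usual way pyspark.ml exposes a param."""

    numPartitions = Param(Params._dummy(), "numPartitions",
                          "Number of partitions (at least 1) used by parallel FP-growth.",
                          typeConverter=TypeConverters.toInt)

    def setNumPartitions(self, value):
        """Sets the value of :py:attr:`numPartitions`."""
        return self._set(numPartitions=value)

    def getNumPartitions(self):
        """Gets the value of numPartitions or its default value."""
        return self.getOrDefault(self.numPartitions)
{code}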



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


