[jira] [Created] (SPARK-40093) is kubernetes jar required if not using that executor?

2022-08-15 Thread t oo (Jira)
t oo created SPARK-40093:


 Summary: is kubernetes jar required if not using that executor?
 Key: SPARK-40093
 URL: https://issues.apache.org/jira/browse/SPARK-40093
 Project: Spark
  Issue Type: Question
  Components: Deploy
Affects Versions: 3.3.0
Reporter: t oo


My Docker image is very large with PySpark included.

Can I remove the files below if I don't use 'ML'?

14M     /usr/local/lib/python3.10/site-packages/pyspark/jars/breeze_2.12-1.2.jar

5.9M    
/usr/local/lib/python3.10/site-packages/pyspark/jars/spark-mllib_2.12-3.3.0.jar






[jira] [Updated] (SPARK-40093) is kubernetes jar required if not using that executor?

2022-08-15 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-40093:
-
Description: 
My Docker image is very large with PySpark included.

Can I remove the files below if I don't use the 'kubernetes executor'?

11M     total
4.0M    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-core-5.12.2.jar
840K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-client-5.12.2.jar
760K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-admissionregistration-5.12.2.jar
704K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-apiextensions-5.12.2.jar
640K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-autoscaling-5.12.2.jar
528K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-extensions-5.12.2.jar
516K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/spark-kubernetes_2.12-3.3.0.jar
456K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-networking-5.12.2.jar
436K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-apps-5.12.2.jar
364K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-storageclass-5.12.2.jar
336K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-policy-5.12.2.jar
264K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-flowcontrol-5.12.2.jar
244K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-batch-5.12.2.jar
192K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-discovery-5.12.2.jar
176K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-rbac-5.12.2.jar
160K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-node-5.12.2.jar
144K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-certificates-5.12.2.jar
104K    
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-events-5.12.2.jar
80K     
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-metrics-5.12.2.jar
68K     
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-scheduling-5.12.2.jar
48K     
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-coordination-5.12.2.jar
20K     
/usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-common-5.12.2.jar

  was:
My Docker image is very large with PySpark included.

Can I remove the files below if I don't use 'ML'?

14M     /usr/local/lib/python3.10/site-packages/pyspark/jars/breeze_2.12-1.2.jar

5.9M    
/usr/local/lib/python3.10/site-packages/pyspark/jars/spark-mllib_2.12-3.3.0.jar


> is kubernetes jar required if not using that executor?
> --
>
> Key: SPARK-40093
> URL: https://issues.apache.org/jira/browse/SPARK-40093
> Project: Spark
>  Issue Type: Question
>  Components: Deploy
>Affects Versions: 3.3.0
>Reporter: t oo
>Priority: Major
>
> My Docker image is very large with PySpark included.
> Can I remove the files below if I don't use the 'kubernetes executor'?
> 11M     total
> 4.0M    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-core-5.12.2.jar
> 840K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-client-5.12.2.jar
> 760K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-admissionregistration-5.12.2.jar
> 704K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-apiextensions-5.12.2.jar
> 640K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-autoscaling-5.12.2.jar
> 528K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-extensions-5.12.2.jar
> 516K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/spark-kubernetes_2.12-3.3.0.jar
> 456K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-networking-5.12.2.jar
> 436K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-apps-5.12.2.jar
> 364K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-storageclass-5.12.2.jar
> 336K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-policy-5.12.2.jar
> 264K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-flowcontrol-5.12.2.jar
> 244K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-batch-5.12.2.jar
> 192K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-discovery-5.12.2.jar
> 176K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-rbac-5.12.2.jar
> 160K    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-node-5.12.2.jar
> 144K    
> 
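
A quick way to sanity-check such a trim (a sketch, not an official recommendation): delete the kubernetes-* jars from pyspark/jars in a test image and run a small local-mode job. This only exercises local mode; it says nothing about jobs actually submitted to a Kubernetes cluster.

{code:python}
# Minimal smoke test after removing the kubernetes-* jars (sketch).
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("local[1]")
         .appName("jar-trim-smoke-test")
         .getOrCreate())

# A trivial job that touches the SQL engine; it should still pass if the
# removed jars were only needed by the Kubernetes scheduler backend.
assert spark.range(100).count() == 100
spark.stop()
{code}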

[jira] [Created] (SPARK-40092) is breeze required if not using ML?

2022-08-15 Thread t oo (Jira)
t oo created SPARK-40092:


 Summary: is breeze required if not using ML?
 Key: SPARK-40092
 URL: https://issues.apache.org/jira/browse/SPARK-40092
 Project: Spark
  Issue Type: Question
  Components: Deploy
Affects Versions: 3.3.0
Reporter: t oo


My Docker image is very large with PySpark included.

Can I remove the file below if I don't use 'Spark Streaming'?

35M     
/usr/local/lib/python3.10/site-packages/pyspark/jars/rocksdbjni-6.20.3.jar






[jira] [Updated] (SPARK-40092) is breeze required if not using ML?

2022-08-15 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-40092:
-
Description: 
My Docker image is very large with PySpark included.

Can I remove the files below if I don't use 'ML'?

14M     /usr/local/lib/python3.10/site-packages/pyspark/jars/breeze_2.12-1.2.jar

5.9M    
/usr/local/lib/python3.10/site-packages/pyspark/jars/spark-mllib_2.12-3.3.0.jar

  was:
My Docker image is very large with PySpark included.

Can I remove the file below if I don't use 'Spark Streaming'?

35M     
/usr/local/lib/python3.10/site-packages/pyspark/jars/rocksdbjni-6.20.3.jar


> is breeze required if not using ML?
> ---
>
> Key: SPARK-40092
> URL: https://issues.apache.org/jira/browse/SPARK-40092
> Project: Spark
>  Issue Type: Question
>  Components: Deploy
>Affects Versions: 3.3.0
>Reporter: t oo
>Priority: Major
>
> My Docker image is very large with PySpark included.
> Can I remove the files below if I don't use 'ML'?
> 14M     
> /usr/local/lib/python3.10/site-packages/pyspark/jars/breeze_2.12-1.2.jar
> 5.9M    
> /usr/local/lib/python3.10/site-packages/pyspark/jars/spark-mllib_2.12-3.3.0.jar






[jira] [Created] (SPARK-40091) is rocksdbjni required if not using streaming?

2022-08-15 Thread t oo (Jira)
t oo created SPARK-40091:


 Summary: is rocksdbjni required if not using streaming?
 Key: SPARK-40091
 URL: https://issues.apache.org/jira/browse/SPARK-40091
 Project: Spark
  Issue Type: Question
  Components: Deploy
Affects Versions: 3.3.0
Reporter: t oo


My Docker image is very large with PySpark included.

Can I remove the file below if I don't use 'Spark Streaming'?

35M     
/usr/local/lib/python3.10/site-packages/pyspark/jars/rocksdbjni-6.20.3.jar






[jira] [Created] (SPARK-38883) smaller pyspark install if not using streaming?

2022-04-12 Thread t oo (Jira)
t oo created SPARK-38883:


 Summary: smaller pyspark install if not using streaming?
 Key: SPARK-38883
 URL: https://issues.apache.org/jira/browse/SPARK-38883
 Project: Spark
  Issue Type: Improvement
  Components: PySpark
Affects Versions: 3.2.1
Reporter: t oo


h3. Describe the feature

I am trying to include PySpark in my Docker image, but the size is around 300 MB.

The largest jar is rocksdbjni-6.20.3.jar at 35 MB.

Is it safe to remove this jar if I have no need for Spark Streaming?

Is there any advice on getting the install smaller? Perhaps a map of which jars 
are needed for batch vs. SQL vs. streaming?
h3. Use Case

A smaller Python package means I can pack more concurrent pods onto my EKS 
workers
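
One way to get a starting point for such a map (a sketch; it only ranks jar sizes and does not tell you which jars a given workload actually needs) is to list the largest jars bundled with the installed pyspark package, assuming the standard pip layout (<site-packages>/pyspark/jars) shown in the related reports:

{code:python}
# Rank the jars bundled with the installed pyspark package by size (sketch).
import os
import pyspark

jars_dir = os.path.join(os.path.dirname(pyspark.__file__), "jars")
jars = sorted(
    ((os.path.getsize(os.path.join(jars_dir, f)), f) for f in os.listdir(jars_dir)),
    reverse=True,
)
for size, name in jars[:15]:
    print(f"{size / 2**20:6.1f} MB  {name}")
{code}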






[jira] [Updated] (SPARK-37420) Oracle JDBC - java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38

2021-11-20 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-37420:
-
Description: 
Reading from Oracle over JDBC is not working as expected; I thought a simple 
DataFrame show should work.

 
{code:java}
/usr/local/bin/pyspark --driver-class-path "/home/user/extra_jar_spark/*" 
--jars "/home/user/extra_jar_spark/*"

jdbc2DF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:@redact") \
    .option("driver", "oracle.jdbc.OracleDriver") \
    .option("dbtable", "s.t") \
    .option("user", "redact") \
    .option("password", "redact") \
    .option("fetchsize", 1) \
    .load()
    
jdbc2DF.printSchema()

root
 |-- ID: decimal(38,10) (nullable = true)
 |-- OBJECT_VERSION_NUMBER: decimal(9,0) (nullable = true)
 |-- START_DATE: timestamp (nullable = true)
 |-- END_DATE: timestamp (nullable = true)
 |-- CREATED_BY: decimal(15,0) (nullable = true)
 |-- CREATION_DATE: timestamp (nullable = true)
 |-- LAST_UPDATED_BY: decimal(15,0) (nullable = true)
 |-- LAST_UPDATE_DATE: timestamp (nullable = true)
 |-- LAST_UPDATE_LOGIN: decimal(15,0) (nullable = true)
 |-- CONTINGENCY: string (nullable = true)
 |-- CONTINGENCY_ID: decimal(38,10) (nullable = true) 


jdbc2DF.show()


21/11/20 23:42:00 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
        at 
org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
        at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.nullSafeConvert(JdbcUtils.scala:546)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3(JdbcUtils.scala:418)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3$adapted(JdbcUtils.scala:416)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:367)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:349)
        at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
        at 
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
        at 
org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
        at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
        at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
        at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:349)
        at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
        at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:131)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
21/11/20 23:42:00 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2) 
(localhost executor driver): java.lang.ArithmeticException: Decimal precision 
49 exceeds max precision 38
        at 
org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
        at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.nullSafeConvert(JdbcUtils.scala:546)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3(JdbcUtils.scala:418)
        at 

[jira] [Updated] (SPARK-37420) Oracle JDBC - java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38

2021-11-20 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-37420:
-
Description: 
Reading from Oracle over JDBC is not working as expected; I thought a simple 
DataFrame show should work.

 
{code:java}
/usr/local/bin/pyspark --driver-class-path "/home/user/extra_jar_spark/*" 
--jars "/home/user/extra_jar_spark/*"

jdbc2DF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:@redact") \
    .option("driver", "oracle.jdbc.OracleDriver") \
    .option("dbtable", "s.t") \
    .option("user", "redact") \
    .option("password", "redact") \
    .option("fetchsize", 1) \
    .load()
    
jdbc2DF.printSchema()

root
 |-- ID: decimal(38,10) (nullable = true)
 |-- OBJECT_VERSION_NUMBER: decimal(9,0) (nullable = true)
 |-- START_DATE: timestamp (nullable = true)
 |-- END_DATE: timestamp (nullable = true)
 |-- CREATED_BY: decimal(15,0) (nullable = true)
 |-- CREATION_DATE: timestamp (nullable = true)
 |-- LAST_UPDATED_BY: decimal(15,0) (nullable = true)
 |-- LAST_UPDATE_DATE: timestamp (nullable = true)
 |-- LAST_UPDATE_LOGIN: decimal(15,0) (nullable = true)
 |-- CONTINGENCY: string (nullable = true)
 |-- CONTINGENCY_ID: decimal(38,10) (nullable = true) 


jdbc2DF.show()


21/11/20 23:42:00 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
        at 
org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
        at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.nullSafeConvert(JdbcUtils.scala:546)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3(JdbcUtils.scala:418)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3$adapted(JdbcUtils.scala:416)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:367)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:349)
        at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
        at 
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
        at 
org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
        at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
        at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
        at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:349)
        at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
        at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:131)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
21/11/20 23:42:00 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2) 
(localhost executor driver): java.lang.ArithmeticException: Decimal precision 
49 exceeds max precision 38
        at 
org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
        at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.nullSafeConvert(JdbcUtils.scala:546)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3(JdbcUtils.scala:418)
        at 

[jira] [Created] (SPARK-37420) Oracle JDBC - java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38

2021-11-20 Thread t oo (Jira)
t oo created SPARK-37420:


 Summary: Oracle JDBC - java.lang.ArithmeticException: Decimal 
precision 49 exceeds max precision 38
 Key: SPARK-37420
 URL: https://issues.apache.org/jira/browse/SPARK-37420
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.2.0
Reporter: t oo


Reading from Oracle over JDBC is not working as expected; I thought a simple 
DataFrame show should work.

 
{code:java}
/usr/local/bin/pyspark --driver-class-path "/home/user/extra_jar_spark/*" 
--jars "/home/user/extra_jar_spark/*"

jdbc2DF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:@redact") \
    .option("driver", "oracle.jdbc.OracleDriver") \
    .option("dbtable", "s.t") \
    .option("user", "redact") \
    .option("password", "redact") \
    .option("fetchsize", 1) \
    .load()
    
jdbc2DF.printSchema()

root
 |-- ID: decimal(38,10) (nullable = true)
 |-- OBJECT_VERSION_NUMBER: decimal(9,0) (nullable = true)
 |-- START_DATE: timestamp (nullable = true)
 |-- END_DATE: timestamp (nullable = true)
 |-- CREATED_BY: decimal(15,0) (nullable = true)
 |-- CREATION_DATE: timestamp (nullable = true)
 |-- LAST_UPDATED_BY: decimal(15,0) (nullable = true)
 |-- LAST_UPDATE_DATE: timestamp (nullable = true)
 |-- LAST_UPDATE_LOGIN: decimal(15,0) (nullable = true)
 |-- CONTINGENCY: string (nullable = true)
 |-- CONTINGENCY_ID: decimal(38,10) (nullable = true) 


jdbc2DF.show()


21/11/20 23:42:00 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
        at 
org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
        at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.nullSafeConvert(JdbcUtils.scala:546)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3(JdbcUtils.scala:418)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3$adapted(JdbcUtils.scala:416)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:367)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:349)
        at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
        at 
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
        at 
org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
        at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
        at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
        at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:349)
        at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
        at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:131)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
21/11/20 23:42:00 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2) 
(localhost executor driver): java.lang.ArithmeticException: Decimal precision 
49 exceeds max precision 38
        at 
org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
        at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
        at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
        at 
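
A possible workaround to experiment with (a sketch, not a confirmed fix): narrow the wide Oracle NUMBER columns before Spark maps them, either by casting inside a subquery passed as "dbtable" or via the JDBC reader's "customSchema" option. The connection values are the placeholders from the report, the column list is illustrative only, and `spark` is the shell session started above.

{code:python}
# Sketch: cast the offending columns down to a precision Spark can hold
# (DecimalType max precision is 38) and/or override the mapped schema.
narrowed = (spark.read
            .format("jdbc")
            .option("url", "jdbc:oracle:thin:@redact")
            .option("driver", "oracle.jdbc.OracleDriver")
            .option("dbtable", "(SELECT CAST(ID AS NUMBER(38,0)) AS ID, "
                               "CAST(CONTINGENCY_ID AS NUMBER(38,0)) AS CONTINGENCY_ID "
                               "FROM s.t) q")
            .option("customSchema", "ID DECIMAL(38,0), CONTINGENCY_ID DECIMAL(38,0)")
            .option("user", "redact")
            .option("password", "redact")
            .option("fetchsize", 1)
            .load())
narrowed.show()
{code}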

[jira] [Comment Edited] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS

2021-07-29 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390198#comment-17390198
 ] 

t oo edited comment on SPARK-35974 at 7/29/21, 11:38 PM:
-

Same issue on Spark 3.1.2:

 
{code:java}
{
  "action" : "SubmissionStatusResponse",
  "driverState" : "ERROR",
  "message" : "Exception from the 
cluster:\njava.nio.file.AccessDeniedException: 
s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: getFileStatus on 
s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: 
com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: hidden; S3 
Extended Request ID: hideit), S3 Extended Request ID: 
hideit\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)\n\torg.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)\n\torg.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:799)\n\torg.apache.spark.util.Utils$.doFetchFile(Utils.scala:776)\n\torg.apache.spark.util.Utils$.fetchFile(Utils.scala:541)\n\torg.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:162)\n\torg.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:180)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)",
  "serverSparkVersion" : "3.1.2",
  "submissionId" : "driver-20210729233253-0001",
  "success" : true,
  "workerHostPort" : "10.redact:17537",
  "workerId" : "worker-20210729232355-10.redact-17537"
}
 {code}


was (Author: toopt4):
Same issue on Spark 3.1.2:

 

{
 "action" : "SubmissionStatusResponse",
 "driverState" : "ERROR",
 "message" : "Exception from the cluster:\njava.nio.file.AccessDeniedException: 
s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: getFileStatus on 
s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: 
com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: hidden; S3 
Extended Request ID: hideit), S3 Extended Request ID: 
hideit\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)\n\torg.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)\n\torg.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:799)\n\torg.apache.spark.util.Utils$.doFetchFile(Utils.scala:776)\n\torg.apache.spark.util.Utils$.fetchFile(Utils.scala:541)\n\torg.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:162)\n\torg.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:180)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)",
 "serverSparkVersion" : "3.1.2",
 "submissionId" : "driver-20210729233253-0001",
 "success" : true,
 "workerHostPort" : "10.redact:17537",
 "workerId" : "worker-20210729232355-10.redact-17537"
}

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: t oo
>Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master 
> spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf 
> spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.hadoop.fs.s3a.secret.key='redact2' --conf 
> spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.hadoop.fs.s3a.session.token='redact3' --conf 
> spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
>  --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf 
> spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' 
> --total-executor-cores 4 --executor-cores 2 --executor-memory 2g 
> 

[jira] [Updated] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS

2021-07-29 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-35974:
-
Affects Version/s: (was: 2.4.8)
   3.1.2

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: t oo
>Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master 
> spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf 
> spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.hadoop.fs.s3a.secret.key='redact2' --conf 
> spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.hadoop.fs.s3a.session.token='redact3' --conf 
> spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
>  --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf 
> spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' 
> --total-executor-cores 4 --executor-cores 2 --executor-memory 2g 
> --driver-memory 1g --name lin1 --deploy-mode cluster --conf 
> spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku 
> s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
> {code}
> Running the above command gives the stack trace below:
>  
> {code:java}
>  Exception from the cluster:\njava.nio.file.AccessDeniedException: 
> s3a://mybuc/metorikku_2.11.jar: getFileStatus on 
> s3a://mybuc/metorikku_2.11.jar: 
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
> S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended 
> Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
> org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
> org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
> org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
> org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
> org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
> org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
> org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
> org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code}
> All the EC2s in the Spark cluster only have access to S3 via STS tokens. The 
> jar itself reads CSVs from S3 using the tokens, and everything works if 
> either (1) I change the command line to point to local jars on the EC2, or 
> (2) I use port 7077/client mode instead of cluster mode. But it seems the jar 
> itself can't be launched off S3, as if the tokens are not being picked up 
> properly.






[jira] [Commented] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS

2021-07-29 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390198#comment-17390198
 ] 

t oo commented on SPARK-35974:
--

Same issue on Spark 3.1.2:

 

{
 "action" : "SubmissionStatusResponse",
 "driverState" : "ERROR",
 "message" : "Exception from the cluster:\njava.nio.file.AccessDeniedException: 
s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: getFileStatus on 
s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: 
com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: hidden; S3 
Extended Request ID: hideit), S3 Extended Request ID: 
hideit\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)\n\torg.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)\n\torg.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:799)\n\torg.apache.spark.util.Utils$.doFetchFile(Utils.scala:776)\n\torg.apache.spark.util.Utils$.fetchFile(Utils.scala:541)\n\torg.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:162)\n\torg.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:180)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)",
 "serverSparkVersion" : "3.1.2",
 "submissionId" : "driver-20210729233253-0001",
 "success" : true,
 "workerHostPort" : "10.redact:17537",
 "workerId" : "worker-20210729232355-10.redact-17537"
}
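
For reference (a sketch): JSON like the above comes from the standalone master's REST submission endpoint on port 6066, which can be polled directly; the host and submission id below are the placeholders from this report.

{code:python}
# Poll the standalone master's REST API for a driver's status (sketch).
import json
import urllib.request

url = "http://myhost:6066/v1/submissions/status/driver-20210729233253-0001"
with urllib.request.urlopen(url) as resp:
    status = json.load(resp)
print(status.get("driverState"), status.get("message", ""))
{code}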

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: t oo
>Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master 
> spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf 
> spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.hadoop.fs.s3a.secret.key='redact2' --conf 
> spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.hadoop.fs.s3a.session.token='redact3' --conf 
> spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
>  --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf 
> spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' 
> --total-executor-cores 4 --executor-cores 2 --executor-memory 2g 
> --driver-memory 1g --name lin1 --deploy-mode cluster --conf 
> spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku 
> s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
> {code}
> Running the above command gives the stack trace below:
>  
> {code:java}
>  Exception from the cluster:\njava.nio.file.AccessDeniedException: 
> s3a://mybuc/metorikku_2.11.jar: getFileStatus on 
> s3a://mybuc/metorikku_2.11.jar: 
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
> S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended 
> Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
> org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
> org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
> org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
> org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
> org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
> org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
> org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
> org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code}
> All the EC2s in the Spark cluster only have access to S3 via STS tokens. The 
> jar itself reads CSVs from S3 using the tokens, and everything works if 
> 

[jira] [Reopened] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS

2021-07-29 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo reopened SPARK-35974:
--

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.8
>Reporter: t oo
>Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master 
> spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf 
> spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.hadoop.fs.s3a.secret.key='redact2' --conf 
> spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.hadoop.fs.s3a.session.token='redact3' --conf 
> spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
>  --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf 
> spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' 
> --total-executor-cores 4 --executor-cores 2 --executor-memory 2g 
> --driver-memory 1g --name lin1 --deploy-mode cluster --conf 
> spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku 
> s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
> {code}
> Running the above command gives the stack trace below:
>  
> {code:java}
>  Exception from the cluster:\njava.nio.file.AccessDeniedException: 
> s3a://mybuc/metorikku_2.11.jar: getFileStatus on 
> s3a://mybuc/metorikku_2.11.jar: 
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
> S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended 
> Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
> org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
> org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
> org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
> org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
> org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
> org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
> org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
> org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code}
> All the EC2s in the Spark cluster only have access to S3 via STS tokens. The 
> jar itself reads CSVs from S3 using the tokens, and everything works if 
> either (1) I change the command line to point to local jars on the EC2, or 
> (2) I use port 7077/client mode instead of cluster mode. But it seems the jar 
> itself can't be launched off S3, as if the tokens are not being picked up 
> properly.
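
One way to narrow this down (a sketch; it assumes hadoop-aws/S3A is on the driver classpath, as it evidently is in this setup): from a worker host, build a session with the same fs.s3a settings the submit command passes and call getFileStatus on the jar, the same call that fails inside DriverRunner. Credential and bucket values are the placeholders from the report.

{code:python}
# Check whether the STS credentials can stat the jar via S3A (sketch).
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.hadoop.fs.s3a.access.key", "redact1")
         .config("spark.hadoop.fs.s3a.secret.key", "redact2")
         .config("spark.hadoop.fs.s3a.session.token", "redact3")
         .config("spark.hadoop.fs.s3a.aws.credentials.provider",
                 "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
         .getOrCreate())

jvm = spark._jvm
path = jvm.org.apache.hadoop.fs.Path("s3a://mybuc/metorikku_2.11.jar")
fs = path.getFileSystem(spark._jsc.hadoopConfiguration())
# Raises AccessDeniedException if the temporary credentials are not picked up.
print(fs.getFileStatus(path))
spark.stop()
{code}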






[jira] [Created] (SPARK-36147) [SQL] - log level should be warning if files not found in BasicWriteStatsTracker

2021-07-14 Thread t oo (Jira)
t oo created SPARK-36147:


 Summary: [SQL] - log level should be warning if files not found in 
BasicWriteStatsTracker
 Key: SPARK-36147
 URL: https://issues.apache.org/jira/browse/SPARK-36147
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.1.2
Reporter: t oo


This log should at least be WARN not INFO (in 
[org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala|https://github.com/apache/spark/pull/2#diff-2a68131bcf611ecf51956d2ba63e8a9ee6d1d20a7a54eeeb4195a4ba1d365134]
 )

 

"Expected $numSubmittedFiles files, but only saw $numFiles."

 

In my case (using S3) just yesterday, the file was never created and the job 
didn't fail! So I ended up with silently missing data.






[jira] [Updated] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS

2021-07-02 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-35974:
-
Affects Version/s: (was: 2.4.6)
   2.4.8
  Description: 
{code:java}
/var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master 
spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf 
spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
spark.hadoop.fs.s3a.secret.key='redact2' --conf 
spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
spark.hadoop.fs.s3a.session.token='redact3' --conf 
spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf 
spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf 
spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
 --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
-DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf 
spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
-DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' 
--total-executor-cores 4 --executor-cores 2 --executor-memory 2g 
--driver-memory 1g --name lin1 --deploy-mode cluster --conf 
spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku 
s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
{code}
Running the above command gives the stack trace below:

 
{code:java}
 Exception from the cluster:\njava.nio.file.AccessDeniedException: 
s3a://mybuc/metorikku_2.11.jar: getFileStatus on 
s3a://mybuc/metorikku_2.11.jar: 
com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended 
Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code}
All the EC2s in the Spark cluster only have access to S3 via STS tokens. The 
jar itself reads CSVs from S3 using the tokens, and everything works if either 
(1) I change the command line to point to local jars on the EC2, or (2) I use 
port 7077/client mode instead of cluster mode. But it seems the jar itself 
can't be launched off S3, as if the tokens are not being picked up properly.

  was:
{code:java}
/var/lib/spark-2.3.4-bin-hadoop2.7/bin/spark-submit --master 
spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf 
spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
spark.hadoop.fs.s3a.secret.key='redact2' --conf 
spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
spark.hadoop.fs.s3a.session.token='redact3' --conf 
spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf 
spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf 
spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
 --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
-DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf 
spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
-DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' 
--total-executor-cores 4 --executor-cores 2 --executor-memory 2g 
--driver-memory 1g --name lin1 --deploy-mode cluster --conf 
spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku 
s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
{code}
Running the above command gives the stack trace below:

 
{code:java}
 Exception from the cluster:\njava.nio.file.AccessDeniedException: 
s3a://mybuc/metorikku_2.11.jar: getFileStatus on 
s3a://mybuc/metorikku_2.11.jar: 
com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended 
Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)

[jira] [Reopened] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS

2021-07-02 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo reopened SPARK-35974:
--

v2.4.8 is less than 2 months old

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.8
>Reporter: t oo
>Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master 
> spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf 
> spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
> spark.hadoop.fs.s3a.secret.key='redact2' --conf 
> spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
> spark.hadoop.fs.s3a.session.token='redact3' --conf 
> spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf 
> spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
>  --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf 
> spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
> -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' 
> --total-executor-cores 4 --executor-cores 2 --executor-memory 2g 
> --driver-memory 1g --name lin1 --deploy-mode cluster --conf 
> spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku 
> s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
> {code}
> Running the above command gives the stack trace below:
>  
> {code:java}
>  Exception from the cluster:\njava.nio.file.AccessDeniedException: 
> s3a://mybuc/metorikku_2.11.jar: getFileStatus on 
> s3a://mybuc/metorikku_2.11.jar: 
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
> S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended 
> Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
> org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
> org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
> org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
> org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
> org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
> org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
> org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
> org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code}
> All the EC2s in the Spark cluster only have access to S3 via STS tokens. The 
> jar itself reads CSVs from S3 using the tokens, and everything works if 
> either (1) I change the command line to point to local jars on the EC2, or 
> (2) I use port 7077/client mode instead of cluster mode. But it seems the jar 
> itself can't be launched off S3, as if the tokens are not being picked up 
> properly.






[jira] [Created] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS

2021-07-01 Thread t oo (Jira)
t oo created SPARK-35974:


 Summary: Spark submit REST cluster/standalone mode - launching an 
s3a jar with STS
 Key: SPARK-35974
 URL: https://issues.apache.org/jira/browse/SPARK-35974
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.4.6
Reporter: t oo


{code:java}
/var/lib/spark-2.3.4-bin-hadoop2.7/bin/spark-submit --master 
spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf 
spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf 
spark.hadoop.fs.s3a.secret.key='redact2' --conf 
spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf 
spark.hadoop.fs.s3a.session.token='redact3' --conf 
spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf 
spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf 
spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
 --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
-DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf 
spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 
-DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' 
--total-executor-cores 4 --executor-cores 2 --executor-memory 2g 
--driver-memory 1g --name lin1 --deploy-mode cluster --conf 
spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku 
s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
{code}
Running the above command gives the stack trace below:

 
{code:java}
 Exception from the cluster:\njava.nio.file.AccessDeniedException: 
s3a://mybuc/metorikku_2.11.jar: getFileStatus on 
s3a://mybuc/metorikku_2.11.jar: 
com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended 
Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code}
All the EC2s in the Spark cluster only have access to S3 via STS tokens. The 
jar itself reads CSVs from S3 using the tokens, and everything works if either 
(1) I change the command line to point to local jars on the EC2, or (2) I use 
port 7077/client mode instead of cluster mode. But it seems the jar itself 
can't be launched off S3, as if the tokens are not being picked up properly.






[jira] [Created] (SPARK-35249) to_timestamp can't parse 6 digit microsecond SSSSSS

2021-04-27 Thread t oo (Jira)
t oo created SPARK-35249:


 Summary: to_timestamp can't parse 6 digit microsecond SSSSSS
 Key: SPARK-35249
 URL: https://issues.apache.org/jira/browse/SPARK-35249
 Project: Spark
  Issue Type: Wish
  Components: SQL
Affects Versions: 2.4.6
Reporter: t oo


spark-sql> select x, to_timestamp(x,"MMM dd yyyy hh:mm:ss.SSSSSSa") from (select 
'Apr 13 2021 12:00:00.001000AM' x);
Apr 13 2021 12:00:00.001000AM NULL

 

Why doesn't the to_timestamp work?
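
A PySpark reproduction of the same question (a sketch; the pattern below is an assumption about the intended format for the sample value, i.e. a 6-digit fractional second plus an AM/PM marker):

{code:python}
# Try parsing the sample value with a 6-digit fraction pattern (sketch).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.master("local[1]").getOrCreate()
df = spark.createDataFrame([("Apr 13 2021 12:00:00.001000AM",)], ["x"])
df.select("x", F.to_timestamp("x", "MMM dd yyyy hh:mm:ss.SSSSSSa")).show(truncate=False)
spark.stop()
{code}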

 






[jira] [Commented] (SPARK-32924) Web UI sort on duration is wrong

2020-11-21 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17236762#comment-17236762
 ] 

t oo commented on SPARK-32924:
--

[~pralabhkumar] yes

> Web UI sort on duration is wrong
> 
>
> Key: SPARK-32924
> URL: https://issues.apache.org/jira/browse/SPARK-32924
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.4.6
>Reporter: t oo
>Priority: Major
> Attachments: ui_sort.png
>
>
> See attachment: 9 s(econds) is showing as larger than 8.1 min.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33085) "Master removed our application" error leads to FAILED driver status instead of KILLED driver status

2020-10-14 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17213815#comment-17213815
 ] 

t oo commented on SPARK-33085:
--

# start a long-running spark job in cluster mode
 # terminate all spark workers
 # check the status of the driver (one way to poll it is sketched below)
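A minimal sketch of how I check the driver state in step 3, assuming the standalone 
REST submission server is enabled (default port 6066) and the submission id 
(e.g. driver-20200930160855-0316) is known; error handling omitted:
{code:scala}
import scala.io.Source

// Sketch only: poll the standalone master's REST endpoint for a driver's state.
def driverState(masterHost: String, submissionId: String): Option[String] = {
  val url  = s"http://$masterHost:6066/v1/submissions/status/$submissionId"
  val body = Source.fromURL(url).mkString
  // The JSON response carries a "driverState" field (RUNNING/FINISHED/KILLED/FAILED/ERROR).
  "\"driverState\"\\s*:\\s*\"(\\w+)\"".r.findFirstMatchIn(body).map(_.group(1))
}
{code}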

> "Master removed our application" error leads to FAILED driver status instead 
> of KILLED driver status
> 
>
> Key: SPARK-33085
> URL: https://issues.apache.org/jira/browse/SPARK-33085
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.4.6
>Reporter: t oo
>Priority: Major
>
>  
> driver-20200930160855-0316 exited with status FAILED
>  
> I am using Spark Standalone scheduler with spot ec2 workers. I confirmed that 
> myip.87 EC2 instance was terminated at 2020-09-30 16:16
>  
> *I would expect the overall driver status to be KILLED but instead it was 
> FAILED*, my goal is to interpret FAILED status as 'don't rerun as 
> non-transient error faced' but KILLED/ERROR status as 'yes, rerun as 
> transient error faced'. But it looks like FAILED status is being set in below 
> case of transient error:
>   
> Below are driver logs
> {code:java}
> 2020-09-30 16:12:41,183 [main] INFO  
> com.yotpo.metorikku.output.writers.file.FileOutputWriter - Writing file to 
> s3a://redacted2020-09-30 16:12:41,183 [main] INFO  
> com.yotpo.metorikku.output.writers.file.FileOutputWriter - Writing file to 
> s3a://redacted20-09-30 16:16:40,366 [dispatcher-event-loop-15] ERROR 
> org.apache.spark.scheduler.TaskSchedulerImpl - Lost executor 0 on myip.87: 
> Remote RPC client disassociated. Likely due to containers exceeding 
> thresholds, or network issues. Check driver logs for WARN messages.2020-09-30 
> 16:16:40,372 [dispatcher-event-loop-15] WARN  
> org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 6.0 (TID 
> 6, myip.87, executor 0): ExecutorLostFailure (executor 0 exited caused by one 
> of the running tasks) Reason: Remote RPC client disassociated. Likely due to 
> containers exceeding thresholds, or network issues. Check driver logs for 
> WARN messages.2020-09-30 16:16:40,376 [dispatcher-event-loop-13] WARN  
> org.apache.spark.storage.BlockManagerMasterEndpoint - No more replicas 
> available for rdd_3_0 !2020-09-30 16:16:40,398 [dispatcher-event-loop-2] INFO 
>  org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
> app-20200930160902-0895/0 removed: Worker shutting down2020-09-30 
> 16:16:40,399 [dispatcher-event-loop-2] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
> executor ID app-20200930160902-0895/1 on hostPort myip.87:11647 with 2 
> core(s), 5.0 GB RAM2020-09-30 16:16:40,401 [dispatcher-event-loop-5] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
> app-20200930160902-0895/1 removed: java.lang.IllegalStateException: Shutdown 
> hooks cannot be modified during shutdown.2020-09-30 16:16:40,402 
> [dispatcher-event-loop-5] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
> executor ID app-20200930160902-0895/2 on hostPort myip.87:11647 with 2 
> core(s), 5.0 GB RAM2020-09-30 16:16:40,403 [dispatcher-event-loop-11] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
> app-20200930160902-0895/2 removed: java.lang.IllegalStateException: Shutdown 
> hooks cannot be modified during shutdown.2020-09-30 16:16:40,404 
> [dispatcher-event-loop-11] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
> executor ID app-20200930160902-0895/3 on hostPort myip.87:11647 with 2 
> core(s), 5.0 GB RAM2020-09-30 16:16:40,405 [dispatcher-event-loop-1] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
> app-20200930160902-0895/3 removed: java.lang.IllegalStateException: Shutdown 
> hooks cannot be modified during shutdown.2020-09-30 16:16:40,406 
> [dispatcher-event-loop-1] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
> executor ID app-20200930160902-0895/4 on hostPort myip.87:11647 with 2 
> core(s), 5.0 GB RAM2020-09-30 16:16:40,407 [dispatcher-event-loop-12] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
> app-20200930160902-0895/4 removed: java.lang.IllegalStateException: Shutdown 
> hooks cannot be modified during shutdown.2020-09-30 16:16:40,408 
> [dispatcher-event-loop-12] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
> executor ID app-20200930160902-0895/5 on hostPort myip.87:11647 with 2 
> core(s), 5.0 GB RAM2020-09-30 16:16:40,409 [dispatcher-event-loop-4] INFO  
> org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - 

[jira] [Commented] (SPARK-27733) Upgrade to Avro 1.10.0

2020-10-11 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211955#comment-17211955
 ] 

t oo commented on SPARK-27733:
--

The Hive 1.x dependency is gone now, I think.

> Upgrade to Avro 1.10.0
> --
>
> Key: SPARK-27733
> URL: https://issues.apache.org/jira/browse/SPARK-27733
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, SQL
>Affects Versions: 3.1.0
>Reporter: Ismaël Mejía
>Priority: Minor
>
> Avro 1.9.2 was released with many nice features, including reduced size (1 MB 
> less), removed dependencies (no paranamer, no shaded guava) and security 
> updates, so it was probably a worthwhile upgrade.
> Avro 1.10.0 was released and this is still not done.
> There is at the moment (2020/08) still a blocker because of Hive related 
> transitive dependencies bringing older versions of Avro, so we could say that 
> this is somehow still blocked until HIVE-21737 is solved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-33085) "Master removed our application" error leads to FAILED driver status instead of KILLED driver status

2020-10-07 Thread t oo (Jira)
t oo created SPARK-33085:


 Summary: "Master removed our application" error leads to FAILED 
driver status instead of KILLED driver status
 Key: SPARK-33085
 URL: https://issues.apache.org/jira/browse/SPARK-33085
 Project: Spark
  Issue Type: Bug
  Components: Scheduler, Spark Core
Affects Versions: 2.4.6
Reporter: t oo


 

driver-20200930160855-0316 exited with status FAILED

 

I am using Spark Standalone scheduler with spot ec2 workers. I confirmed that 
myip.87 EC2 instance was terminated at 2020-09-30 16:16

 

*I would expect the overall driver status to be KILLED, but instead it was 
FAILED.* My goal is to interpret a FAILED status as 'don't rerun, a non-transient 
error was hit' and a KILLED/ERROR status as 'yes, rerun, a transient error was hit'. 
But it looks like the FAILED status is being set in the transient-error case below:

  

Below are driver logs
{code:java}
2020-09-30 16:12:41,183 [main] INFO  
com.yotpo.metorikku.output.writers.file.FileOutputWriter - Writing file to 
s3a://redacted2020-09-30 16:12:41,183 [main] INFO  
com.yotpo.metorikku.output.writers.file.FileOutputWriter - Writing file to 
s3a://redacted20-09-30 16:16:40,366 [dispatcher-event-loop-15] ERROR 
org.apache.spark.scheduler.TaskSchedulerImpl - Lost executor 0 on myip.87: 
Remote RPC client disassociated. Likely due to containers exceeding thresholds, 
or network issues. Check driver logs for WARN messages.2020-09-30 16:16:40,372 
[dispatcher-event-loop-15] WARN  org.apache.spark.scheduler.TaskSetManager - 
Lost task 0.0 in stage 6.0 (TID 6, myip.87, executor 0): ExecutorLostFailure 
(executor 0 exited caused by one of the running tasks) Reason: Remote RPC 
client disassociated. Likely due to containers exceeding thresholds, or network 
issues. Check driver logs for WARN messages.2020-09-30 16:16:40,376 
[dispatcher-event-loop-13] WARN  
org.apache.spark.storage.BlockManagerMasterEndpoint - No more replicas 
available for rdd_3_0 !2020-09-30 16:16:40,398 [dispatcher-event-loop-2] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
app-20200930160902-0895/0 removed: Worker shutting down2020-09-30 16:16:40,399 
[dispatcher-event-loop-2] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
executor ID app-20200930160902-0895/1 on hostPort myip.87:11647 with 2 core(s), 
5.0 GB RAM2020-09-30 16:16:40,401 [dispatcher-event-loop-5] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
app-20200930160902-0895/1 removed: java.lang.IllegalStateException: Shutdown 
hooks cannot be modified during shutdown.2020-09-30 16:16:40,402 
[dispatcher-event-loop-5] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
executor ID app-20200930160902-0895/2 on hostPort myip.87:11647 with 2 core(s), 
5.0 GB RAM2020-09-30 16:16:40,403 [dispatcher-event-loop-11] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
app-20200930160902-0895/2 removed: java.lang.IllegalStateException: Shutdown 
hooks cannot be modified during shutdown.2020-09-30 16:16:40,404 
[dispatcher-event-loop-11] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
executor ID app-20200930160902-0895/3 on hostPort myip.87:11647 with 2 core(s), 
5.0 GB RAM2020-09-30 16:16:40,405 [dispatcher-event-loop-1] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
app-20200930160902-0895/3 removed: java.lang.IllegalStateException: Shutdown 
hooks cannot be modified during shutdown.2020-09-30 16:16:40,406 
[dispatcher-event-loop-1] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
executor ID app-20200930160902-0895/4 on hostPort myip.87:11647 with 2 core(s), 
5.0 GB RAM2020-09-30 16:16:40,407 [dispatcher-event-loop-12] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
app-20200930160902-0895/4 removed: java.lang.IllegalStateException: Shutdown 
hooks cannot be modified during shutdown.2020-09-30 16:16:40,408 
[dispatcher-event-loop-12] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
executor ID app-20200930160902-0895/5 on hostPort myip.87:11647 with 2 core(s), 
5.0 GB RAM2020-09-30 16:16:40,409 [dispatcher-event-loop-4] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
app-20200930160902-0895/5 removed: java.lang.IllegalStateException: Shutdown 
hooks cannot be modified during shutdown.2020-09-30 16:16:40,410 
[dispatcher-event-loop-5] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted 
executor ID app-20200930160902-0895/6 on hostPort myip.87:11647 with 2 core(s), 
5.0 GB RAM2020-09-30 16:16:40,420 [dispatcher-event-loop-9] INFO  
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor 
app-20200930160902-0895/6 removed: 

[jira] [Created] (SPARK-33053) Document driver states (difference between FAILED/ERROR/KILLED)

2020-10-02 Thread t oo (Jira)
t oo created SPARK-33053:


 Summary: Document driver states (difference between 
FAILED/ERROR/KILLED)
 Key: SPARK-33053
 URL: https://issues.apache.org/jira/browse/SPARK-33053
 Project: Spark
  Issue Type: Documentation
  Components: docs, Documentation
Affects Versions: 2.4.6
Reporter: t oo


Looking at the Spark website I could not find any documentation on the difference 
between these driver states:

ERROR

FAILED

KILLED



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-32924) Web UI sort on duration is wrong

2020-09-17 Thread t oo (Jira)
t oo created SPARK-32924:


 Summary: Web UI sort on duration is wrong
 Key: SPARK-32924
 URL: https://issues.apache.org/jira/browse/SPARK-32924
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 2.4.6
Reporter: t oo
 Attachments: ui_sort.png

See attachment, 9 s(econds) is showing as larger than 8.1min



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-32924) Web UI sort on duration is wrong

2020-09-17 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-32924:
-
Attachment: ui_sort.png

> Web UI sort on duration is wrong
> 
>
> Key: SPARK-32924
> URL: https://issues.apache.org/jira/browse/SPARK-32924
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.4.6
>Reporter: t oo
>Priority: Major
> Attachments: ui_sort.png
>
>
> See attachment, 9 s(econds) is showing as larger than 8.1min



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-32373) Spark Standalone - RetryingBlockFetcher tries to get block from worker even 10mins after it was de-registered from spark cluster

2020-07-20 Thread t oo (Jira)
t oo created SPARK-32373:


 Summary: Spark Standalone - RetryingBlockFetcher tries to get 
block from worker even 10mins after it was de-registered from spark cluster
 Key: SPARK-32373
 URL: https://issues.apache.org/jira/browse/SPARK-32373
 Project: Spark
  Issue Type: Bug
  Components: Block Manager, Scheduler, Shuffle, Spark Core
Affects Versions: 2.4.6
Reporter: t oo


Using Spark standalone 2.4.6 with spot EC2 instances: the .242 IP instance was 
terminated at 12:00:11pm. Before then it had been registered in the Spark UI as 
ALIVE for a few hours; it then appeared in the Spark UI as DEAD until 12:16pm, and 
after that it disappeared from the Spark UI completely. An app that started at 
11:24am hit the error below. As you can see in the app log below (from another 
worker), it is still trying to fetch a shuffle block from the .242 IP at 12:10pm, 
10 minutes after the worker was removed from the spark cluster. I would expect 
the retries to stop within a couple of minutes of the worker being removed from 
the cluster.
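For reference, the retry cadence in the log below seems to be governed by the 
shuffle client's retry settings; a sketch of the relevant knobs with example 
values only (whether tuning them is the right fix is part of the question here):
{code:scala}
import org.apache.spark.SparkConf

// Sketch only: example values for the shuffle-fetch retry settings.
val conf = new SparkConf()
  .set("spark.shuffle.io.maxRetries", "2")  // default is 3 retries per fetch
  .set("spark.shuffle.io.retryWait", "5s")  // default wait between retries is 5s
  .set("spark.network.timeout", "60s")      // default network timeout is 120s
{code}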

 
{code:java}

2020-07-20 12:10:02,702 [Block Fetch Retry-9-3] ERROR 
org.apache.spark.network.shuffle.RetryingBlockFetcher - Exception while 
beginning fetch of 1 outstanding blocks (after 3 retries)
java.io.IOException: Connecting to /redact.242:7337 timed out (12 ms)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:243)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at 
org.apache.spark.network.shuffle.ExternalShuffleClient.lambda$fetchBlocks$0(ExternalShuffleClient.java:100)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.lambda$initiateRetry$0(RetryingBlockFetcher.java:169)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
2020-07-20 12:07:57,700 [Block Fetch Retry-9-2] ERROR 
org.apache.spark.network.shuffle.RetryingBlockFetcher - Exception while 
beginning fetch of 1 outstanding blocks (after 2 retries)
java.io.IOException: Connecting to /redact.242:7337 timed out (12 ms)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:243)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at 
org.apache.spark.network.shuffle.ExternalShuffleClient.lambda$fetchBlocks$0(ExternalShuffleClient.java:100)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.lambda$initiateRetry$0(RetryingBlockFetcher.java:169)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
2020-07-20 12:05:52,697 [Block Fetch Retry-9-1] ERROR 
org.apache.spark.network.shuffle.RetryingBlockFetcher - Exception while 
beginning fetch of 1 outstanding blocks (after 1 retries)
java.io.IOException: Connecting to /redact.242:7337 timed out (12 ms)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:243)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at 
org.apache.spark.network.shuffle.ExternalShuffleClient.lambda$fetchBlocks$0(ExternalShuffleClient.java:100)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)
at 
org.apache.spark.network.shuffle.RetryingBlockFetcher.lambda$initiateRetry$0(RetryingBlockFetcher.java:169)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at 

[jira] [Updated] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED

2020-07-06 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-32197:
-
Description: 
App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
driver to fail if app fails.

 

Thread dump from jstack (on the driver pid) attached (j1.out)

Last part of stdout driver log attached (full log is 23MB, stderr log just has 
launch command)

Last part of app logs attached

 

Can see that "org.apache.spark.util.ShutdownHookManager - Shutdown hook called" 
 line never appears in the driver log after "org.apache.spark.SparkContext - 
Successfully stopped SparkContext"

 

Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
6066) in cluster mode was used. Other drivers/apps have worked fine with this 
setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
terminate at any time. From checking aws logs: the worker was terminated at 
01:53:38

 

I think you can replicate this by tearing down worker machine while app is 
running. You might have to try several times.

 

Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!

 

  was:
App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
driver to fail if app fails.

 

Thread dump from jstack (on the driver pid) attached (j1.out)

Last part of stdout driver log attached (full log is 23MB, stderr log just has 
launch command)

Last part of app logs attached

 

Can see that "org.apache.spark.util.ShutdownHookManager - Shutdown hook called" 
 ine never appears in the driver log after "org.apache.spark.SparkContext - 
Successfully stopped SparkContext"

 

Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
6066) in cluster mode was used. Other drivers/apps have worked fine with this 
setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
terminate at any time. From checking aws logs: the worker was terminated at 
01:53:38

 

I think you can replicate this by tearing down worker machine while app is 
running. You might have to try several times.

 

Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!

 


> 'Spark driver' stays running even though 'spark application' has FAILED
> ---
>
> Key: SPARK-32197
> URL: https://issues.apache.org/jira/browse/SPARK-32197
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.4.6
>Reporter: t oo
>Priority: Blocker
> Attachments: app_executors.png, applog.txt, driverlog.txt, 
> failed1.png, failed_stages.png, failedapp.png, j1.out, stuckdriver.png
>
>
> App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
> driver to fail if app fails.
>  
> Thread dump from jstack (on the driver pid) attached (j1.out)
> Last part of stdout driver log attached (full log is 23MB, stderr log just 
> has launch command)
> Last part of app logs attached
>  
> Can see that "org.apache.spark.util.ShutdownHookManager - Shutdown hook 
> called"  line never appears in the driver log after 
> "org.apache.spark.SparkContext - Successfully stopped SparkContext"
>  
> Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
> 6066) in cluster mode was used. Other drivers/apps have worked fine with this 
> setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
> master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
> terminate at any time. From checking aws logs: the worker was terminated at 
> 01:53:38
>  
> I think you can replicate this by tearing down worker machine while app is 
> running. You might have to try several times.
>  
> Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED

2020-07-06 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-32197:
-
Description: 
App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
driver to fail if app fails.

 

Thread dump from jstack (on the driver pid) attached (j1.out)

Last part of stdout driver log attached (full log is 23MB, stderr log just has 
launch command)

Last part of app logs attached

 

Can see that "org.apache.spark.util.ShutdownHookManager - Shutdown hook called" 
 ine never appears in the driver log after "org.apache.spark.SparkContext - 
Successfully stopped SparkContext"

 

Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
6066) in cluster mode was used. Other drivers/apps have worked fine with this 
setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
terminate at any time. From checking aws logs: the worker was terminated at 
01:53:38

 

I think you can replicate this by tearing down worker machine while app is 
running. You might have to try several times.

 

Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!

 

  was:
App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
driver to fail if app fails.

 

Thread dump from jstack (on the driver pid) attached (j1.out)

Last part of stdout driver log attached (full log is 23MB, stderr log just has 
launch command)

Last part of app logs attached

 

 

 

Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
6066) in cluster mode was used. Other drivers/apps have worked fine with this 
setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
terminate at any time. From checking aws logs: the worker was terminated at 
01:53:38

 

I think you can replicate this by tearing down worker machine while app is 
running. You might have to try several times.

 

Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!

 


> 'Spark driver' stays running even though 'spark application' has FAILED
> ---
>
> Key: SPARK-32197
> URL: https://issues.apache.org/jira/browse/SPARK-32197
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.4.6
>Reporter: t oo
>Priority: Blocker
> Attachments: app_executors.png, applog.txt, driverlog.txt, 
> failed1.png, failed_stages.png, failedapp.png, j1.out, stuckdriver.png
>
>
> App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
> driver to fail if app fails.
>  
> Thread dump from jstack (on the driver pid) attached (j1.out)
> Last part of stdout driver log attached (full log is 23MB, stderr log just 
> has launch command)
> Last part of app logs attached
>  
> Can see that "org.apache.spark.util.ShutdownHookManager - Shutdown hook 
> called"  ine never appears in the driver log after 
> "org.apache.spark.SparkContext - Successfully stopped SparkContext"
>  
> Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
> 6066) in cluster mode was used. Other drivers/apps have worked fine with this 
> setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
> master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
> terminate at any time. From checking aws logs: the worker was terminated at 
> 01:53:38
>  
> I think you can replicate this by tearing down worker machine while app is 
> running. You might have to try several times.
>  
> Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED

2020-07-06 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-32197:
-
Attachment: app_executors.png
failed_stages.png
failed1.png
stuckdriver.png
failedapp.png

> 'Spark driver' stays running even though 'spark application' has FAILED
> ---
>
> Key: SPARK-32197
> URL: https://issues.apache.org/jira/browse/SPARK-32197
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.4.6
>Reporter: t oo
>Priority: Blocker
> Attachments: app_executors.png, applog.txt, driverlog.txt, 
> failed1.png, failed_stages.png, failedapp.png, j1.out, stuckdriver.png
>
>
> App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
> driver to fail if app fails.
>  
> Thread dump from jstack (on the driver pid) attached (j1.out)
> Last part of stdout driver log attached (full log is 23MB, stderr log just 
> has launch command)
> Last part of app logs attached
>  
>  
>  
> Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
> 6066) in cluster mode was used. Other drivers/apps have worked fine with this 
> setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
> master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
> terminate at any time. From checking aws logs: the worker was terminated at 
> 01:53:38
>  
> I think you can replicate this by tearing down worker machine while app is 
> running. You might have to try several times.
>  
> Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED

2020-07-06 Thread t oo (Jira)
t oo created SPARK-32197:


 Summary: 'Spark driver' stays running even though 'spark 
application' has FAILED
 Key: SPARK-32197
 URL: https://issues.apache.org/jira/browse/SPARK-32197
 Project: Spark
  Issue Type: Bug
  Components: Scheduler, Spark Core
Affects Versions: 2.4.6
Reporter: t oo
 Attachments: applog.txt, driverlog.txt, j1.out

App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
driver to fail if app fails.

 

Thread dump from jstack (on the driver pid) attached (j1.out)

Last part of stdout driver log attached (full log is 23MB, stderr log just has 
launch command)

Last part of app logs attached

 

 

 

Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
6066) in cluster mode was used. Other drivers/apps have worked fine with this 
setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
terminate at any time. From checking aws logs: the worker was terminated at 
01:53:38

 

I think you can replicate this by tearing down worker machine while app is 
running. You might have to try several times.

 

Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED

2020-07-06 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-32197:
-
Attachment: applog.txt
j1.out
driverlog.txt

> 'Spark driver' stays running even though 'spark application' has FAILED
> ---
>
> Key: SPARK-32197
> URL: https://issues.apache.org/jira/browse/SPARK-32197
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.4.6
>Reporter: t oo
>Priority: Blocker
> Attachments: applog.txt, driverlog.txt, j1.out
>
>
> App failed in 6 minutes, driver has been stuck for > 8 hours. I would expect 
> driver to fail if app fails.
>  
> Thread dump from jstack (on the driver pid) attached (j1.out)
> Last part of stdout driver log attached (full log is 23MB, stderr log just 
> has launch command)
> Last part of app logs attached
>  
>  
>  
> Using spark 2.4.6 with spark standalone mode. spark-submit to REST API (port 
> 6066) in cluster mode was used. Other drivers/apps have worked fine with this 
> setup, just this one getting stuck. My cluster has 1 EC2 dedicated as spark 
> master and 1 Spot EC2 dedicated as spark worker. They can auto heal/spot 
> terminate at any time. From checking aws logs: the worker was terminated at 
> 01:53:38
>  
> I think you can replicate this by tearing down worker machine while app is 
> running. You might have to try several times.
>  
> Similar to https://issues.apache.org/jira/browse/SPARK-24617 i raised before!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-32040) Idle cores not being allocated

2020-06-20 Thread t oo (Jira)
t oo created SPARK-32040:


 Summary: Idle cores not being allocated
 Key: SPARK-32040
 URL: https://issues.apache.org/jira/browse/SPARK-32040
 Project: Spark
  Issue Type: Bug
  Components: Scheduler
Affects Versions: 2.4.5
Reporter: t oo


Background: 
I have a cluster (2.4.5) using standalone mode orchestrated by Nomad jobs 
running on EC2. We deploy a Scala web server as a long running jar via 
`spark-submit` in client mode. Sometimes we get into a state where the 
application ends up with 0 cores due to our in-house autoscaler scaling down 
and killing workers without checking if any of the cores in the worker were 
allocated to existing applications. These applications then end up with 0 
cores, even though there are healthy workers in the cluster. 

However, only if I submit a new application or register a new worker in the 
cluster will the master finally reallocate cores to the application. This is 
problematic, because the long-running zero-core application is stuck until then.

Could this be related to the fact that `schedule()` is only triggered by new 
workers / new applications as commented here? 
[https://github.com/apache/spark/blob/v2.4.5/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L721-L724]

If that is the case, should the master also be calling `schedule()` when it 
removes workers after calling `timeOutWorkers()` (a sketch of this is below)? 
[https://github.com/apache/spark/blob/v2.4.5/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L417]
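A hedged sketch of the shape of that change (handler and method names as I read 
them from the linked Master.scala; this is not a tested patch):
{code:scala}
// Sketch only, inside org.apache.spark.deploy.master.Master#receive: the proposal
// is to re-run scheduling right after dead workers are pruned, so applications
// that lost their executors can pick up cores on the remaining alive workers.
case CheckForWorkerTimeOut =>
  timeOutDeadWorkers()
  schedule() // proposed addition
{code}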

The downscaling causes me to see this in my logs, so I am fairly certain 
`timeOutWorkers()` is being called: 
{code}
20/06/08 11:40:56 INFO Master: Application app-20200608114056-0006 requested to set total executors to 1. 
20/06/08 11:40:56 INFO Master: Launching executor app-20200608114056-0006/0 on worker worker-20200608113523--7077 
20/06/08 11:41:44 WARN Master: Removing worker-20200608113523--7077 because we got no heartbeat in 60 seconds 
20/06/08 11:41:44 INFO Master: Removing worker worker-20200608113523--7077 on :7077 
20/06/08 11:41:44 INFO Master: Telling app of lost executor: 0 
20/06/08 11:41:44 INFO Master: Telling app of lost worker: worker-20200608113523-10.158.242.213-7077 
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2020-04-29 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095434#comment-17095434
 ] 

t oo commented on SPARK-29037:
--

With Spark 2.3.4 and Hadoop 2.8.5 I am facing this with plain Overwrite mode 
(not dynamic partition overwrite) and 
"spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version" = 2. 

Scenario: 
1. successful spark run = single copy of the data, 
2. successful spark run = single copy of the data, 
3. aborted spark run during the write stage, 
4. successful spark run = duplicated data, 
5. successful spark run = triplicated data... etc.


> [Core] Spark gives duplicate result when an application was killed and rerun
> 
>
> Key: SPARK-29037
> URL: https://issues.apache.org/jira/browse/SPARK-29037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.3.3
>Reporter: feiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> For InsertIntoHadoopFsRelation operations.
> Case A:
> Application appA insert overwrite table table_a with static partition 
> overwrite.
> But it was killed when committing tasks, because one task is hang.
> And parts of its committed tasks output is kept under 
> /path/table_a/_temporary/0/.
> Then we rerun appA. It will reuse the staging dir /path/table_a/_temporary/0/.
> It executes successfully.
> But it also commit the data reminded by killed application to destination dir.
> Case B:
> Application appA insert overwrite table table_a.
> Application appB insert overwrite table table_a, too.
> They execute concurrently, and they may all use /path/table_a/_temporary/0/ 
> as workPath.
> And their result may be corruptted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24617) Spark driver not requesting another executor once original executor exits due to 'lost worker'

2020-03-01 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-24617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048627#comment-17048627
 ] 

t oo commented on SPARK-24617:
--

same problem in spark 2.3.4

> Spark driver not requesting another executor once original executor exits due 
> to 'lost worker'
> --
>
> Key: SPARK-24617
> URL: https://issues.apache.org/jira/browse/SPARK-24617
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.1.1
>Reporter: t oo
>Priority: Major
>  Labels: bulk-closed
>
> I am running Spark v2.1.1 in 'standalone' mode (no yarn/mesos) across EC2s. I 
> have 1 master ec2 that acts as the driver (since spark-submit is called on 
> this host), spark.master is setup, deploymode is client (so sparksubmit only 
> returns a ReturnCode to the putty window once it finishes processing). I have 
> 1 worker ec2 that is registered with the spark master. When i run sparksubmit 
> on the master, I can see in the WebUI that executors starting on the worker 
> and I can verify successful completion. However if while the sparksubmit is 
> running and the worker ec2 gets terminated and then new ec2 worker becomes 
> alive 3mins later and registers with the master, I have noticed on the webui 
> that it shows 'cannot find address' in the executor status but the driver 
> keeps waiting forever (2 days later I kill it) or in some cases the driver 
> allocates tasks to the new worker only 5 hours later and then completes! Is 
> there some setting i am missing that would explain this behavior?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30561) start spark applications without a 30second startup penalty

2020-02-25 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045015#comment-17045015
 ] 

t oo commented on SPARK-30561:
--

[~rxin] it looks like you committed `val SLEEP_TIME_SECS = 5` in BlockManager; do 
you happen to recall why?

> start spark applications without a 30second startup penalty
> ---
>
> Key: SPARK-30561
> URL: https://issues.apache.org/jira/browse/SPARK-30561
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Major
>
> see 
> https://stackoverflow.com/questions/57610138/how-to-start-spark-applications-without-a-30second-startup-penalty
> using spark standalone.
> There are several sleeps that can be removed:
> grep -i 'sleep(' -R * | grep -v 'src/test/' | grep -E '^core' | grep -ivE 
> 'mesos|yarn|python|HistoryServer|spark/ui/'
> core/src/main/scala/org/apache/spark/util/Clock.scala:  
> Thread.sleep(sleepTime)
> core/src/main/scala/org/apache/spark/SparkContext.scala:   * sc.parallelize(1 
> to 1, 2).map { i => Thread.sleep(10); i }.count()
> core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala:  
> private def delay(secs: Duration = 5.seconds) = Thread.sleep(secs.toMillis)
> core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala: 
>  Thread.sleep(1000)
> core/src/main/scala/org/apache/spark/deploy/Client.scala:
> Thread.sleep(5000)
> core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala:  
> Thread.sleep(100)
> core/src/main/scala/org/apache/spark/deploy/StandaloneResourceUtils.scala:
>   Thread.sleep(duration)
> core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:def 
> sleep(seconds: Int): Unit = (0 until seconds).takeWhile { _ =>
> core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:  
> Thread.sleep(1000)
> core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:
> sleeper.sleep(waitSeconds)
> core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:  def 
> sleep(seconds: Int): Unit
> core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionClient.scala:  
> Thread.sleep(REPORT_DRIVER_STATUS_INTERVAL)
> core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala:  
> Thread.sleep(10)
> core/src/main/scala/org/apache/spark/storage/BlockManager.scala:  
> Thread.sleep(SLEEP_TIME_SECS * 1000L)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30561) start spark applications without a 30second startup penalty

2020-02-25 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045013#comment-17045013
 ] 

t oo commented on SPARK-30561:
--

[~pwendell] it looks like you added the `sleep(5000)`; do you happen to recall 
why?

> start spark applications without a 30second startup penalty
> ---
>
> Key: SPARK-30561
> URL: https://issues.apache.org/jira/browse/SPARK-30561
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Major
>
> see 
> https://stackoverflow.com/questions/57610138/how-to-start-spark-applications-without-a-30second-startup-penalty
> using spark standalone.
> There are several sleeps that can be removed:
> grep -i 'sleep(' -R * | grep -v 'src/test/' | grep -E '^core' | grep -ivE 
> 'mesos|yarn|python|HistoryServer|spark/ui/'
> core/src/main/scala/org/apache/spark/util/Clock.scala:  
> Thread.sleep(sleepTime)
> core/src/main/scala/org/apache/spark/SparkContext.scala:   * sc.parallelize(1 
> to 1, 2).map { i => Thread.sleep(10); i }.count()
> core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala:  
> private def delay(secs: Duration = 5.seconds) = Thread.sleep(secs.toMillis)
> core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala: 
>  Thread.sleep(1000)
> core/src/main/scala/org/apache/spark/deploy/Client.scala:
> Thread.sleep(5000)
> core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala:  
> Thread.sleep(100)
> core/src/main/scala/org/apache/spark/deploy/StandaloneResourceUtils.scala:
>   Thread.sleep(duration)
> core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:def 
> sleep(seconds: Int): Unit = (0 until seconds).takeWhile { _ =>
> core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:  
> Thread.sleep(1000)
> core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:
> sleeper.sleep(waitSeconds)
> core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:  def 
> sleep(seconds: Int): Unit
> core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionClient.scala:  
> Thread.sleep(REPORT_DRIVER_STATUS_INTERVAL)
> core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala:  
> Thread.sleep(10)
> core/src/main/scala/org/apache/spark/storage/BlockManager.scala:  
> Thread.sleep(SLEEP_TIME_SECS * 1000L)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30610) spark worker graceful shutdown

2020-02-25 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044974#comment-17044974
 ] 

t oo commented on SPARK-30610:
--

>From [~holdenkarau]: In its present state it could help, you'd call decom 
>instead of stop. But we'd probably want to see the last step to fully consider 
>https://issues.apache.org/jira/browse/SPARK-30610 solved, right now it won't 
>schedule any new jobs but won't exit & shutdown automatically (you can use a 
>timer and sort of approximate it but it's not perfect).

> spark worker graceful shutdown
> --
>
> Key: SPARK-30610
> URL: https://issues.apache.org/jira/browse/SPARK-30610
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 3.1.0
>Reporter: t oo
>Priority: Minor
>
> I am not talking about spark streaming! just regular batch jobs using 
> spark-submit that may try to read large csv (100+gb) then write it out as 
> parquet. In an autoscaling cluster would be nice to be able to scale down (ie 
> terminate) ec2s without slowing down active spark applications.
> for example:
> 1. start spark cluster with 8 ec2s
> 2. submit 6 spark apps
> 3. 1 spark app completes, so 5 apps still running
> 4. cluster can scale down 1 ec2 (to save $) but don't want to make the 
> existing apps running on the (soon to be terminated) ec2 have to make its csv 
> read, RDD processing steps.etc start from the beginning on different ec2's 
> executors. Instead want to have a 'graceful shutdown' command so that the 8th 
> ec2 does not accept new spark-submit apps to it (ie don't start new executors 
> on it) but finish the ones that have already launched on it, then exit the 
> worker pid. then the ec2 can be terminated
> I thought stop-slave.sh could do this but looks like it just kills the pid



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30610) spark worker graceful shutdown

2020-01-22 Thread t oo (Jira)
t oo created SPARK-30610:


 Summary: spark worker graceful shutdown
 Key: SPARK-30610
 URL: https://issues.apache.org/jira/browse/SPARK-30610
 Project: Spark
  Issue Type: Improvement
  Components: Scheduler
Affects Versions: 2.4.4
Reporter: t oo


I am not talking about Spark Streaming, just regular batch jobs using 
spark-submit that may read a large CSV (100+ GB) and then write it out as 
parquet. In an autoscaling cluster it would be nice to be able to scale down 
(i.e. terminate) EC2s without slowing down active spark applications.

For example:
1. start a spark cluster with 8 EC2s
2. submit 6 spark apps
3. 1 spark app completes, so 5 apps are still running
4. the cluster can now scale down 1 EC2 (to save $), but I don't want the 
existing apps running on the (soon to be terminated) EC2 to have to restart 
their CSV reads, RDD processing steps, etc. from the beginning on a different 
EC2's executors. Instead I want a 'graceful shutdown' command so that the 8th 
EC2 does not accept new spark-submit apps (i.e. does not start new executors) 
but finishes the ones already launched on it and then exits the worker pid. 
Then the EC2 can be terminated.


I thought stop-slave.sh could do this, but it looks like it just kills the pid.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30561) start spark applications without a 30second startup penalty

2020-01-18 Thread t oo (Jira)
t oo created SPARK-30561:


 Summary: start spark applications without a 30second startup 
penalty
 Key: SPARK-30561
 URL: https://issues.apache.org/jira/browse/SPARK-30561
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.4.4
Reporter: t oo


see 
https://stackoverflow.com/questions/57610138/how-to-start-spark-applications-without-a-30second-startup-penalty

using spark standalone.

There are several sleeps that can be removed:

grep -i 'sleep(' -R * | grep -v 'src/test/' | grep -E '^core' | grep -ivE 
'mesos|yarn|python|HistoryServer|spark/ui/'
core/src/main/scala/org/apache/spark/util/Clock.scala:  
Thread.sleep(sleepTime)
core/src/main/scala/org/apache/spark/SparkContext.scala:   * sc.parallelize(1 
to 1, 2).map { i => Thread.sleep(10); i }.count()
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala:  private 
def delay(secs: Duration = 5.seconds) = Thread.sleep(secs.toMillis)
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala:  
Thread.sleep(1000)
core/src/main/scala/org/apache/spark/deploy/Client.scala:Thread.sleep(5000)
core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala:  
Thread.sleep(100)
core/src/main/scala/org/apache/spark/deploy/StandaloneResourceUtils.scala:  
Thread.sleep(duration)
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:def 
sleep(seconds: Int): Unit = (0 until seconds).takeWhile { _ =>
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:  
Thread.sleep(1000)
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:
sleeper.sleep(waitSeconds)
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala:  def 
sleep(seconds: Int): Unit
core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionClient.scala:
  Thread.sleep(REPORT_DRIVER_STATUS_INTERVAL)
core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala:  
Thread.sleep(10)
core/src/main/scala/org/apache/spark/storage/BlockManager.scala:  
Thread.sleep(SLEEP_TIME_SECS * 1000L)




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30560) allow driver to consume a fractional core

2020-01-18 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-30560:
-
Description: 
see 
https://stackoverflow.com/questions/56781927/apache-spark-standalone-scheduler-why-does-driver-need-a-whole-core-in-cluste

this is to make it possible for a driver to use 0.2 cores rather than a whole 
core

Standard CPUs, no GPUs

  was:
see 
https://stackoverflow.com/questions/56781927/apache-spark-standalone-scheduler-why-does-driver-need-a-whole-core-in-cluste

this is to make it possible for a driver to use 0.2 cores rather than a whole 
core


> allow driver to consume a fractional core
> -
>
> Key: SPARK-30560
> URL: https://issues.apache.org/jira/browse/SPARK-30560
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: 2.4.4
>Reporter: t oo
>Priority: Minor
>
> see 
> https://stackoverflow.com/questions/56781927/apache-spark-standalone-scheduler-why-does-driver-need-a-whole-core-in-cluste
> this is to make it possible for a driver to use 0.2 cores rather than a whole 
> core
> Standard CPUs, no GPUs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30560) allow driver to consume a fractional core

2020-01-18 Thread t oo (Jira)
t oo created SPARK-30560:


 Summary: allow driver to consume a fractional core
 Key: SPARK-30560
 URL: https://issues.apache.org/jira/browse/SPARK-30560
 Project: Spark
  Issue Type: Improvement
  Components: Scheduler
Affects Versions: 2.4.4
Reporter: t oo


see 
https://stackoverflow.com/questions/56781927/apache-spark-standalone-scheduler-why-does-driver-need-a-whole-core-in-cluste

this is to make it possible for a driver to use 0.2 cores rather than a whole 
core



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service

2020-01-18 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018544#comment-17018544
 ] 

t oo commented on SPARK-27750:
--

Did some digging; let me know if I'm on the right track.

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L785
 ---> why the "precedence" comment? I want the opposite behaviour.

In `private def schedule(): Unit`, add something like: `rawFreeCores = 
shuffledAliveWorkers.map(_.coresFree).sum`, a new config `cores_reserved_for_apps` 
(e.g. 8), and `forDriversFreeCores = math.max(rawFreeCores - 
cores_reserved_for_apps, 0)`; then wrap the inner driver-launch steps in 
`if (forDriversFreeCores >= driver.desc.cores)` (a sketch of this is below).
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L796

(This last link is just a note about the driver/executor requests.)
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L746-L775
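Rough sketch of the above (the config key name is made up here for illustration; 
this is not a tested patch):
{code:scala}
// Sketch only, inside Master.schedule(): hold back a configurable number of cores
// from drivers so that applications can always obtain executors.
val reservedForApps = conf.getInt("spark.deploy.coresReservedForApps", 0) // hypothetical key
val rawFreeCores    = shuffledAliveWorkers.map(_.coresFree).sum
val coresForDrivers = math.max(rawFreeCores - reservedForApps, 0)

if (coresForDrivers >= driver.desc.cores) {
  // ...existing driver-launch logic would go here...
}
{code}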



> Standalone scheduler - ability to prioritize applications over drivers, many 
> drivers act like Denial of Service
> ---
>
> Key: SPARK-27750
> URL: https://issues.apache.org/jira/browse/SPARK-27750
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Minor
>
> If I submit 1000 spark submit drivers then they consume all the cores on my 
> cluster (essentially it acts like a Denial of Service) and no spark 
> 'application' gets to run since the cores are all consumed by the 'drivers'. 
> This feature is about having the ability to prioritize applications over 
> drivers so that at least some 'applications' can start running. I guess it 
> would be like: If (driver.state = 'submitted' and (exists some app.state = 
> 'submitted')) then set app.state = 'running'
> if all apps have app.state = 'running' then set driver.state = 'submitted' 
>  
> Secondary to this, why must a driver consume a minimum of 1 entire core?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service

2020-01-16 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017448#comment-17017448
 ] 

t oo commented on SPARK-27750:
--

Yes, I hit this. 'Spark standalone' is the RM here, and it has no pools. I am 
thinking one way to do it is a new config, e.g. cores_reserved_for_apps=8 
(changeable); those 8 cores then could not be consumed by drivers.

> Standalone scheduler - ability to prioritize applications over drivers, many 
> drivers act like Denial of Service
> ---
>
> Key: SPARK-27750
> URL: https://issues.apache.org/jira/browse/SPARK-27750
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Minor
>
> If I submit 1000 spark submit drivers then they consume all the cores on my 
> cluster (essentially it acts like a Denial of Service) and no spark 
> 'application' gets to run since the cores are all consumed by the 'drivers'. 
> This feature is about having the ability to prioritize applications over 
> drivers so that at least some 'applications' can start running. I guess it 
> would be like: If (driver.state = 'submitted' and (exists some app.state = 
> 'submitted')) then set app.state = 'running'
> if all apps have app.state = 'running' then set driver.state = 'submitted' 
>  
> Secondary to this, why must a driver consume a minimum of 1 entire core?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service

2020-01-15 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989249#comment-16989249
 ] 

t oo edited comment on SPARK-27750 at 1/16/20 6:33 AM:
---

WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] 
[~zsxwing] [~jlaskowski] [~cloud_fan] [~srowen] [~dongjoon] [~hyukjin.kwon]


was (Author: toopt4):
WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] 
[~zsxwing] [~jlaskowski] [~cloud_fan]

> Standalone scheduler - ability to prioritize applications over drivers, many 
> drivers act like Denial of Service
> ---
>
> Key: SPARK-27750
> URL: https://issues.apache.org/jira/browse/SPARK-27750
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Minor
>
> If I submit 1000 spark submit drivers then they consume all the cores on my 
> cluster (essentially it acts like a Denial of Service) and no spark 
> 'application' gets to run since the cores are all consumed by the 'drivers'. 
> This feature is about having the ability to prioritize applications over 
> drivers so that at least some 'applications' can start running. I guess it 
> would be like: If (driver.state = 'submitted' and (exists some app.state = 
> 'submitted')) then set app.state = 'running'
> if all apps have app.state = 'running' then set driver.state = 'submitted' 
>  
> Secondary to this, why must a driver consume a minimum of 1 entire core?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service

2020-01-15 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989249#comment-16989249
 ] 

t oo edited comment on SPARK-27750 at 1/15/20 11:36 PM:


WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] 
[~zsxwing] [~jlaskowski] [~cloud_fan]


was (Author: toopt4):
WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] 
[~zsxwing] [~jlaskowski]

> Standalone scheduler - ability to prioritize applications over drivers, many 
> drivers act like Denial of Service
> ---
>
> Key: SPARK-27750
> URL: https://issues.apache.org/jira/browse/SPARK-27750
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Minor
>
> If I submit 1000 spark submit drivers then they consume all the cores on my 
> cluster (essentially it acts like a Denial of Service) and no spark 
> 'application' gets to run since the cores are all consumed by the 'drivers'. 
> This feature is about having the ability to prioritize applications over 
> drivers so that at least some 'applications' can start running. I guess it 
> would be like: If (driver.state = 'submitted' and (exists some app.state = 
> 'submitted')) then set app.state = 'running'
> if all apps have app.state = 'running' then set driver.state = 'submitted' 
>  
> Secondary to this, why must a driver consume a minimum of 1 entire core?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service

2020-01-11 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989249#comment-16989249
 ] 

t oo edited comment on SPARK-27750 at 1/11/20 12:59 PM:


WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] 
[~zsxwing] [~jlaskowski]


was (Author: toopt4):
bump

 

> Standalone scheduler - ability to prioritize applications over drivers, many 
> drivers act like Denial of Service
> ---
>
> Key: SPARK-27750
> URL: https://issues.apache.org/jira/browse/SPARK-27750
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Minor
>
> If I submit 1000 spark submit drivers then they consume all the cores on my 
> cluster (essentially it acts like a Denial of Service) and no spark 
> 'application' gets to run since the cores are all consumed by the 'drivers'. 
> This feature is about having the ability to prioritize applications over 
> drivers so that at least some 'applications' can start running. I guess it 
> would be like: If (driver.state = 'submitted' and (exists some app.state = 
> 'submitted')) then set app.state = 'running'
> if all apps have app.state = 'running' then set driver.state = 'submitted' 
>  
> Secondary to this, why must a driver consume a minimum of 1 entire core?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-27429) [SQL] to_timestamp function with additional argument flag that will allow exception if value could not be cast

2019-12-25 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo reopened SPARK-27429:
--

Presto has both behaviours, 
https://prestosql.io/docs/current/functions/conditional.html
By default it fails on bad data; to get null for bad data you wrap the expression in TRY().
I ingest user data as strings first, then I want to conform it to the proper (cast) data 
types with SQL, but right now I don't get any errors back!
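
For context, a small PySpark snippet showing the behaviour described above (column and literal values are illustrative): to_timestamp silently yields null for a value that does not match the pattern, and there is no flag to request an error instead.

{code:python}
# Current behaviour: a non-timestamp string quietly becomes null.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("2019-12-25 10:00:00",), ("ABC",)], ["input_col"])

df.select(F.to_timestamp("input_col", "yyyy-MM-dd HH:mm:ss").alias("ts")).show()
# +-------------------+
# |                 ts|
# +-------------------+
# |2019-12-25 10:00:00|
# |               null|
# +-------------------+
{code}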

> [SQL] to_timestamp function with additional argument flag that will allow 
> exception if value could not be cast
> --
>
> Key: SPARK-27429
> URL: https://issues.apache.org/jira/browse/SPARK-27429
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.1
>Reporter: t oo
>Priority: Major
>
> If I am running a SQL on a csv based dataframe and my query has  
> to_timestamp(input_col,'-MM-dd HH:mm:ss'), if the values in input_col are 
> not really timestamp like 'ABC' then I would like to_timestamp function to 
> throw an exception rather than happily (silently) return the values as null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x

2019-12-25 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17003393#comment-17003393
 ] 

t oo commented on SPARK-27491:
--

[~skonto] curl works:
curl http://spark-master.corp.com:6066/v1/submissions/status/
but spark-submit --status does not.
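
For reference, a minimal Python sketch of polling the standalone REST status endpoint directly, which is the curl workaround above expressed in code (the master URL and driver id are placeholders taken from this ticket):

{code:python}
# Query the standalone master's REST API for a driver's status, since
# spark-submit --status prints nothing here. Host and driver id are placeholders.
import json
import urllib.request

MASTER = "http://spark-master.corp.com:6066"
DRIVER_ID = "driver-20190417130324-0009"

with urllib.request.urlopen(f"{MASTER}/v1/submissions/status/{DRIVER_ID}") as resp:
    status = json.load(resp)

print(status.get("driverState"), status.get("workerHostPort"))
{code}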

> SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty 
> response! therefore Airflow won't integrate with Spark 2.3.x
> --
>
> Key: SPARK-27491
> URL: https://issues.apache.org/jira/browse/SPARK-27491
> Project: Spark
>  Issue Type: Bug
>  Components: Java API, Scheduler, Spark Core, Spark Shell, Spark 
> Submit
>Affects Versions: 2.3.3, 2.4.4
>Reporter: t oo
>Priority: Major
>
> This issue must have been introduced after Spark 2.1.1 as it is working in 
> that version. This issue is affecting me in Spark 2.3.3/2.3.0. I am using 
> spark standalone mode if that makes a difference.
> See below spark 2.3.3 returns empty response while 2.1.1 returns a response.
>  
> Spark 2.1.1:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + export SPARK_HOME=/home/ec2here/spark_home1
> + SPARK_HOME=/home/ec2here/spark_home1
> + '[' -z /home/ec2here/spark_home1 ']'
> + . /home/ec2here/spark_home1/bin/load-spark-env.sh
> ++ '[' -z /home/ec2here/spark_home1 ']'
> ++ '[' -z '' ']'
> ++ export SPARK_ENV_LOADED=1
> ++ SPARK_ENV_LOADED=1
> ++ parent_dir=/home/ec2here/spark_home1
> ++ user_conf_dir=/home/ec2here/spark_home1/conf
> ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
> ++ set -a
> ++ . /home/ec2here/spark_home1/conf/spark-env.sh
> +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
> +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
>  ulimit -n 1048576
> ++ set +a
> ++ '[' -z '' ']'
> ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
> ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
> ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
> ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
> ++ export SPARK_SCALA_VERSION=2.10
> ++ SPARK_SCALA_VERSION=2.10
> + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
> + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
> + '[' -d /home/ec2here/spark_home1/jars ']'
> + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
> + '[' '!' -d /home/ec2here/spark_home1/jars ']'
> + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
> + '[' -n '' ']'
> + [[ -n '' ]]
> + CMD=()
> + IFS=
> + read -d '' -r ARG
> ++ build_command org.apache.spark.deploy.SparkSubmit --master 
> spark://domainhere:6066 --status driver-20190417130324-0009
> ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp 
> '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> ++ printf '%d\0' 0
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + COUNT=10
> + LAST=9
> + LAUNCHER_EXIT_CODE=0
> + [[ 0 =~ ^[0-9]+$ ]]
> + '[' 0 '!=' 0 ']'
> + CMD=("${CMD[@]:0:$LAST}")
> + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp 
> '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the 
> status of submission driver-20190417130324-0009 in spark://domainhere:6066.
> 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with 
> SubmissionStatusResponse:
> {
>  "action" : "SubmissionStatusResponse",
>  "driverState" : "FAILED",
>  "serverSparkVersion" : "2.3.3",
>  "submissionId" : "driver-20190417130324-0009",
>  "success" : true,
>  "workerHostPort" : "x.y.211.40:11819",
>  "workerId" : "worker-20190417115840-x.y.211.40-11819"
> }
> [ec2here@ip-x-y-160-225 ~]$
>  
> Spark 2.3.3:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> 

[jira] [Updated] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x

2019-12-25 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-27491:
-
Affects Version/s: 2.4.4

> SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty 
> response! therefore Airflow won't integrate with Spark 2.3.x
> --
>
> Key: SPARK-27491
> URL: https://issues.apache.org/jira/browse/SPARK-27491
> Project: Spark
>  Issue Type: Bug
>  Components: Java API, Scheduler, Spark Core, Spark Shell, Spark 
> Submit
>Affects Versions: 2.3.3, 2.4.4
>Reporter: t oo
>Priority: Major
>
> This issue must have been introduced after Spark 2.1.1 as it is working in 
> that version. This issue is affecting me in Spark 2.3.3/2.3.0. I am using 
> spark standalone mode if that makes a difference.
> See below spark 2.3.3 returns empty response while 2.1.1 returns a response.
>  
> Spark 2.1.1:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + export SPARK_HOME=/home/ec2here/spark_home1
> + SPARK_HOME=/home/ec2here/spark_home1
> + '[' -z /home/ec2here/spark_home1 ']'
> + . /home/ec2here/spark_home1/bin/load-spark-env.sh
> ++ '[' -z /home/ec2here/spark_home1 ']'
> ++ '[' -z '' ']'
> ++ export SPARK_ENV_LOADED=1
> ++ SPARK_ENV_LOADED=1
> ++ parent_dir=/home/ec2here/spark_home1
> ++ user_conf_dir=/home/ec2here/spark_home1/conf
> ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
> ++ set -a
> ++ . /home/ec2here/spark_home1/conf/spark-env.sh
> +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
> +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
>  ulimit -n 1048576
> ++ set +a
> ++ '[' -z '' ']'
> ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
> ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
> ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
> ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
> ++ export SPARK_SCALA_VERSION=2.10
> ++ SPARK_SCALA_VERSION=2.10
> + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
> + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
> + '[' -d /home/ec2here/spark_home1/jars ']'
> + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
> + '[' '!' -d /home/ec2here/spark_home1/jars ']'
> + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
> + '[' -n '' ']'
> + [[ -n '' ]]
> + CMD=()
> + IFS=
> + read -d '' -r ARG
> ++ build_command org.apache.spark.deploy.SparkSubmit --master 
> spark://domainhere:6066 --status driver-20190417130324-0009
> ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp 
> '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> ++ printf '%d\0' 0
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + COUNT=10
> + LAST=9
> + LAUNCHER_EXIT_CODE=0
> + [[ 0 =~ ^[0-9]+$ ]]
> + '[' 0 '!=' 0 ']'
> + CMD=("${CMD[@]:0:$LAST}")
> + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp 
> '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the 
> status of submission driver-20190417130324-0009 in spark://domainhere:6066.
> 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with 
> SubmissionStatusResponse:
> {
>  "action" : "SubmissionStatusResponse",
>  "driverState" : "FAILED",
>  "serverSparkVersion" : "2.3.3",
>  "submissionId" : "driver-20190417130324-0009",
>  "success" : true,
>  "workerHostPort" : "x.y.211.40:11819",
>  "workerId" : "worker-20190417115840-x.y.211.40-11819"
> }
> [ec2here@ip-x-y-160-225 ~]$
>  
> Spark 2.3.3:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + '[' -z '' ']'
> ++ dirname /home/ec2here/spark_home/bin/spark-class
> + source 

[jira] [Updated] (SPARK-30251) faster way to read csv.gz?

2019-12-13 Thread t oo (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-30251:
-
Description: 
Some data providers give files in csv.gz (i.e. 1gb compressed which is 25gb 
uncompressed; or 5gb compressed which is 130gb uncompressed; or 0.1gb compressed 
which is 2.5gb uncompressed). When I tell my boss that the famous big data tool 
Spark takes 16hrs to convert the 1gb compressed file into parquet, there is a look 
of shock. This is batch data we receive daily (80gb compressed, 2tb uncompressed 
every day, spread across ~300 files).

I know gz is not splittable so it ends up loaded on a single worker, but we don't 
have the space/patience to do a pre-conversion to bz2 or uncompressed. Can Spark 
have a better codec? I saw posts mentioning even plain Python is faster than Spark.

 

[https://stackoverflow.com/questions/40492967/dealing-with-a-large-gzipped-file-in-spark]

[https://github.com/nielsbasjes/splittablegzip]

 

 

  was:
Some data providers give files in csv.gz (i.e. 1gb compressed which is 25gb 
uncompressed; or 5gb compressed which is 130gb uncompressed; or 0.1gb compressed 
which is 2.5gb uncompressed). When I tell my boss that the famous big data tool 
Spark takes 16hrs to convert the 1gb compressed file into parquet, there is a look 
of shock. This is batch data we receive daily (80gb compressed, 2tb uncompressed 
every day, spread across ~300 files).

I know gz is not splittable so it is currently loaded on a single worker, but we don't 
have the space/patience to do a pre-conversion to bz2 or uncompressed. Can Spark 
have a better codec? I saw posts mentioning even plain Python is faster than Spark.

 

[https://stackoverflow.com/questions/40492967/dealing-with-a-large-gzipped-file-in-spark]

[https://github.com/nielsbasjes/splittablegzip]

 

 


> faster way to read csv.gz?
> --
>
> Key: SPARK-30251
> URL: https://issues.apache.org/jira/browse/SPARK-30251
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 2.4.4
>Reporter: t oo
>Priority: Major
>
> Some data providers give files in csv.gz (i.e. 1gb compressed which is 25gb 
> uncompressed; or 5gb compressed which is 130gb uncompressed; or 0.1gb compressed 
> which is 2.5gb uncompressed). When I tell my boss that the famous big data 
> tool Spark takes 16hrs to convert the 1gb compressed file into parquet, there 
> is a look of shock. This is batch data we receive daily (80gb compressed, 2tb 
> uncompressed every day, spread across ~300 files).
> I know gz is not splittable so it ends up loaded on a single worker, but we 
> don't have the space/patience to do a pre-conversion to bz2 or uncompressed. 
> Can Spark have a better codec? I saw posts mentioning even plain Python is 
> faster than Spark.
>  
> [https://stackoverflow.com/questions/40492967/dealing-with-a-large-gzipped-file-in-spark]
> [https://github.com/nielsbasjes/splittablegzip]
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30251) faster way to read csv.gz?

2019-12-13 Thread t oo (Jira)
t oo created SPARK-30251:


 Summary: faster way to read csv.gz?
 Key: SPARK-30251
 URL: https://issues.apache.org/jira/browse/SPARK-30251
 Project: Spark
  Issue Type: New Feature
  Components: Spark Core
Affects Versions: 2.4.4
Reporter: t oo


Some data providers give files in csv.gz (i.e. 1gb compressed which is 25gb 
uncompressed; or 5gb compressed which is 130gb uncompressed; or 0.1gb compressed 
which is 2.5gb uncompressed). When I tell my boss that the famous big data tool 
Spark takes 16hrs to convert the 1gb compressed file into parquet, there is a look 
of shock. This is batch data we receive daily (80gb compressed, 2tb uncompressed 
every day, spread across ~300 files).

I know gz is not splittable so it is currently loaded on a single worker, but we don't 
have the space/patience to do a pre-conversion to bz2 or uncompressed. Can Spark 
have a better codec? I saw posts mentioning even plain Python is faster than Spark.

 

[https://stackoverflow.com/questions/40492967/dealing-with-a-large-gzipped-file-in-spark]

[https://github.com/nielsbasjes/splittablegzip]
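
For illustration, a minimal PySpark sketch of the usual mitigation (paths and the partition count are placeholders): the gzip file is still decompressed by a single task, but repartitioning straight after the read lets the parquet conversion itself run in parallel.

{code:python}
# The .csv.gz is read by one task because gzip is not splittable, so
# repartition immediately to spread the expensive downstream work.
# Paths and the partition count are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("csv-gz-to-parquet").getOrCreate()

df = spark.read.option("header", "true").csv("/data/in/provider.csv.gz")

df.repartition(200).write.mode("overwrite").parquet("/data/out/provider_parquet")
{code}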

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26346) Upgrade parquet to 1.11.0

2019-12-08 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991165#comment-16991165
 ] 

t oo commented on SPARK-26346:
--

Parquet 1.11.0 has been released:
https://www.apache.org/dist/parquet/apache-parquet-1.11.0/
http://mail-archives.apache.org/mod_mbox/parquet-dev/201912.mbox/browser

> Upgrade parquet to 1.11.0
> -
>
> Key: SPARK-26346
> URL: https://issues.apache.org/jira/browse/SPARK-26346
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30162) Filter is not being pushed down for Parquet files

2019-12-07 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990725#comment-16990725
 ] 

t oo commented on SPARK-30162:
--

Did you try it in the Scala spark-shell?

> Filter is not being pushed down for Parquet files
> -
>
> Key: SPARK-30162
> URL: https://issues.apache.org/jira/browse/SPARK-30162
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
> Environment: pyspark 3.0 preview
> Ubuntu/Centos
> pyarrow 0.14.1 
>Reporter: Nasir Ali
>Priority: Major
>
> Filters are not pushed down in Spark 3.0 preview. Also the output of 
> "explain" method is different. It is hard to debug in 3.0 whether filters 
> were pushed down or not. Below code could reproduce the bug:
>  
> {code:java}
> // code placeholder
> df = spark.createDataFrame([("usr1",17.00, "2018-03-10T15:27:18+00:00"),
> ("usr1",13.00, "2018-03-11T12:27:18+00:00"),
> ("usr1",25.00, "2018-03-12T11:27:18+00:00"),
> ("usr1",20.00, "2018-03-13T15:27:18+00:00"),
> ("usr1",17.00, "2018-03-14T12:27:18+00:00"),
> ("usr2",99.00, "2018-03-15T11:27:18+00:00"),
> ("usr2",156.00, "2018-03-22T11:27:18+00:00"),
> ("usr2",17.00, "2018-03-31T11:27:18+00:00"),
> ("usr2",25.00, "2018-03-15T11:27:18+00:00"),
> ("usr2",25.00, "2018-03-16T11:27:18+00:00")
> ],
>["user","id", "ts"])
> df = df.withColumn('ts', df.ts.cast('timestamp'))
> df.write.partitionBy("user").parquet("/home/cnali/data/")df2 = 
> spark.read.load("/home/cnali/data/")df2.filter("user=='usr2'").explain(True)
> {code}
> {code:java}
> // Spark 2.4 output
> == Parsed Logical Plan ==
> 'Filter ('user = usr2)
> +- Relation[id#38,ts#39,user#40] parquet== Analyzed Logical Plan ==
> id: double, ts: timestamp, user: string
> Filter (user#40 = usr2)
> +- Relation[id#38,ts#39,user#40] parquet== Optimized Logical Plan ==
> Filter (isnotnull(user#40) && (user#40 = usr2))
> +- Relation[id#38,ts#39,user#40] parquet== Physical Plan ==
> *(1) FileScan parquet [id#38,ts#39,user#40] Batched: true, Format: Parquet, 
> Location: InMemoryFileIndex[file:/home/cnali/data], PartitionCount: 1, 
> PartitionFilters: [isnotnull(user#40), (user#40 = usr2)], PushedFilters: [], 
> ReadSchema: struct{code}
> {code:java}
> // Spark 3.0.0-preview output
> == Parsed Logical Plan ==
> 'Filter ('user = usr2)
> +- RelationV2[id#0, ts#1, user#2] parquet file:/home/cnali/data== Analyzed 
> Logical Plan ==
> id: double, ts: timestamp, user: string
> Filter (user#2 = usr2)
> +- RelationV2[id#0, ts#1, user#2] parquet file:/home/cnali/data== Optimized 
> Logical Plan ==
> Filter (isnotnull(user#2) AND (user#2 = usr2))
> +- RelationV2[id#0, ts#1, user#2] parquet file:/home/cnali/data== Physical 
> Plan ==
> *(1) Project [id#0, ts#1, user#2]
> +- *(1) Filter (isnotnull(user#2) AND (user#2 = usr2))
>+- *(1) ColumnarToRow
>   +- BatchScan[id#0, ts#1, user#2] ParquetScan Location: 
> InMemoryFileIndex[file:/home/cnali/data], ReadSchema: 
> struct
> {code}
> I have tested it on much larger dataset. Spark 3.0 tries to load whole data 
> and then apply filter. Whereas Spark 2.4 push down the filter. Above output 
> shows that Spark 2.4 applied partition filter but not the Spark 3.0 preview.
>  
> Minor: in Spark 3.0 "explain()" output is truncated (maybe fixed length?) and 
> it's hard to debug.  spark.sql.orc.cache.stripe.details.size=1 doesn't 
> work.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27623) Provider org.apache.spark.sql.avro.AvroFileFormat could not be instantiated

2019-12-07 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990723#comment-16990723
 ] 

t oo commented on SPARK-27623:
--

bump
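
A hedged guess about the FileFormat$class error quoted below: it often indicates a Scala version mismatch between the spark-avro artifact and the Spark build (the 2.4.2 binaries default to Scala 2.12), so pinning a matching package may help. Sketch only, with an illustrative path:

{code:python}
# Assumes the failure is a Scala 2.11 vs 2.12 mismatch between spark-avro
# and the Spark 2.4.2 build; the input path is a placeholder.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:2.4.2")
         .getOrCreate())

df = spark.read.format("avro").load("/path/to/data.avro")
{code}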

> Provider org.apache.spark.sql.avro.AvroFileFormat could not be instantiated
> ---
>
> Key: SPARK-27623
> URL: https://issues.apache.org/jira/browse/SPARK-27623
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.2
>Reporter: Alexandru Barbulescu
>Priority: Major
>
> After updating to spark 2.4.2 when using the 
> {code:java}
> spark.read.format().options().load()
> {code}
>  
> chain of methods, regardless of what parameter is passed to "format" we get 
> the following error related to avro:
>  
> {code:java}
> - .options(**load_options)
> - File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 
> 172, in load
> - File "/opt/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 
> 1257, in __call__
> - File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in 
> deco
> - File "/opt/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 
> 328, in get_return_value
> - py4j.protocol.Py4JJavaError: An error occurred while calling o69.load.
> - : java.util.ServiceConfigurationError: 
> org.apache.spark.sql.sources.DataSourceRegister: Provider 
> org.apache.spark.sql.avro.AvroFileFormat could not be instantiated
> - at java.util.ServiceLoader.fail(ServiceLoader.java:232)
> - at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
> - at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
> - at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> - at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> - at 
> scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:44)
> - at scala.collection.Iterator.foreach(Iterator.scala:941)
> - at scala.collection.Iterator.foreach$(Iterator.scala:941)
> - at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
> - at scala.collection.IterableLike.foreach(IterableLike.scala:74)
> - at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
> - at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
> - at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:250)
> - at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:248)
> - at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108)
> - at scala.collection.TraversableLike.filter(TraversableLike.scala:262)
> - at scala.collection.TraversableLike.filter$(TraversableLike.scala:262)
> - at scala.collection.AbstractTraversable.filter(Traversable.scala:108)
> - at 
> org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:630)
> - at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194)
> - at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
> - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> - at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> - at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> - at java.lang.reflect.Method.invoke(Method.java:498)
> - at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> - at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> - at py4j.Gateway.invoke(Gateway.java:282)
> - at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> - at py4j.commands.CallCommand.execute(CallCommand.java:79)
> - at py4j.GatewayConnection.run(GatewayConnection.java:238)
> - at java.lang.Thread.run(Thread.java:748)
> - Caused by: java.lang.NoClassDefFoundError: 
> org/apache/spark/sql/execution/datasources/FileFormat$class
> - at org.apache.spark.sql.avro.AvroFileFormat.(AvroFileFormat.scala:44)
> - at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> - at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> - at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> - at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> - at java.lang.Class.newInstance(Class.java:442)
> - at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> - ... 29 more
> - Caused by: java.lang.ClassNotFoundException: 
> org.apache.spark.sql.execution.datasources.FileFormat$class
> - at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> - at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> - at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> - at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> - ... 36 more
> {code}
>  
> The code we run looks like 

[jira] [Commented] (SPARK-26346) Upgrade parquet to 1.11.0

2019-12-07 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990722#comment-16990722
 ] 

t oo commented on SPARK-26346:
--

bump

> Upgrade parquet to 1.11.0
> -
>
> Key: SPARK-26346
> URL: https://issues.apache.org/jira/browse/SPARK-26346
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26091) Upgrade to 2.3.4 for Hive Metastore Client 2.3

2019-12-06 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989616#comment-16989616
 ] 

t oo commented on SPARK-26091:
--

But I don't use 2.3.3.

> Upgrade to 2.3.4 for Hive Metastore Client 2.3
> --
>
> Key: SPARK-26091
> URL: https://issues.apache.org/jira/browse/SPARK-26091
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26091) Upgrade to 2.3.4 for Hive Metastore Client 2.3

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989461#comment-16989461
 ] 

t oo commented on SPARK-26091:
--

I just want to access Hive metastore 2.3.4.
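
For reference, a minimal sketch of how the metastore client version is chosen at session build time (illustrative only; accepting "2.3.4" here relies on the client upgrade tracked by this ticket, which landed in 3.0):

{code:python}
# Point Spark at a specific Hive metastore client version; support for the
# "2.3.4" value depends on the upgrade tracked by this ticket.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .enableHiveSupport()
         .config("spark.sql.hive.metastore.version", "2.3.4")
         .config("spark.sql.hive.metastore.jars", "maven")  # fetch matching client jars
         .getOrCreate())
{code}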

> Upgrade to 2.3.4 for Hive Metastore Client 2.3
> --
>
> Key: SPARK-26091
> URL: https://issues.apache.org/jira/browse/SPARK-26091
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26091) Upgrade to 2.3.4 for Hive Metastore Client 2.3

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989256#comment-16989256
 ] 

t oo commented on SPARK-26091:
--

Can this go into Spark 2.4.5?

 

> Upgrade to 2.3.4 for Hive Metastore Client 2.3
> --
>
> Key: SPARK-26091
> URL: https://issues.apache.org/jira/browse/SPARK-26091
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22860) Spark workers log ssl passwords passed to the executors

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-22860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989255#comment-16989255
 ] 

t oo commented on SPARK-22860:
--

[~kabhwan] can this go in 2.4.5?

> Spark workers log ssl passwords passed to the executors
> ---
>
> Key: SPARK-22860
> URL: https://issues.apache.org/jira/browse/SPARK-22860
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Felix K.
>Assignee: Jungtaek Lim
>Priority: Major
> Fix For: 3.0.0
>
>
> The workers log the spark.ssl.keyStorePassword and 
> spark.ssl.trustStorePassword passed by cli to the executor processes. The 
> ExecutorRunner should escape passwords to not appear in the worker's log 
> files in INFO level. In this example, you can see my 'SuperSecretPassword' in 
> a worker log:
> {code}
> 17/12/08 08:04:12 INFO ExecutorRunner: Launch command: 
> "/global/myapp/oem/jdk/bin/java" "-cp" 
> "/global/myapp/application/myapp_software/thing_loader_lib/core-repository-model-zzz-1.2.3-SNAPSHOT.jar
> [...]
> :/global/myapp/application/spark-2.1.1-bin-hadoop2.7/jars/*" "-Xmx16384M" 
> "-Dspark.authenticate.enableSaslEncryption=true" 
> "-Dspark.ssl.keyStorePassword=SuperSecretPassword" 
> "-Dspark.ssl.keyStore=/global/myapp/application/config/ssl/keystore.jks" 
> "-Dspark.ssl.trustStore=/global/myapp/application/config/ssl/truststore.jks" 
> "-Dspark.ssl.enabled=true" "-Dspark.driver.port=39927" 
> "-Dspark.ssl.protocol=TLS" 
> "-Dspark.ssl.trustStorePassword=SuperSecretPassword" 
> "-Dspark.authenticate=true" "-Dmyapp_IMPORT_DATE=2017-10-30" 
> "-Dmyapp.config.directory=/global/myapp/application/config" 
> "-Dsolr.httpclient.builder.factory=com.company.myapp.loader.auth.LoaderConfigSparkSolrBasicAuthConfigurer"
>  
> "-Djavax.net.ssl.trustStore=/global/myapp/application/config/ssl/truststore.jks"
>  "-XX:+UseG1GC" "-XX:+UseStringDeduplication" 
> "-Dthings.loader.export.zzz_files=false" 
> "-Dlog4j.configuration=file:/global/myapp/application/config/spark-executor-log4j.properties"
>  "-XX:+HeapDumpOnOutOfMemoryError" "-XX:+UseStringDeduplication" 
> "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" 
> "spark://CoarseGrainedScheduler@192.168.0.1:39927" "--executor-id" "2" 
> "--hostname" "192.168.0.1" "--cores" "4" "--app-id" "app-20171208080412-" 
> "--worker-url" "spark://Worker@192.168.0.1:59530"
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989254#comment-16989254
 ] 

t oo commented on SPARK-23534:
--

close?

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24590) Make Jenkins tests passed with hadoop 3 profile

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989253#comment-16989253
 ] 

t oo commented on SPARK-24590:
--

close?

 

> Make Jenkins tests passed with hadoop 3 profile
> ---
>
> Key: SPARK-24590
> URL: https://issues.apache.org/jira/browse/SPARK-24590
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 2.4.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> Currently, some tests are being failed with hadoop-3 profile.
> Given PR builder 
> (https://github.com/apache/spark/pull/21441#issuecomment-397818337), it 
> reported:
> {code}
> org.apache.spark.sql.hive.HiveSparkSubmitSuite.SPARK-8020: set sql conf in 
> spark conf
> org.apache.spark.sql.hive.HiveSparkSubmitSuite.SPARK-9757 Persist Parquet 
> relation with decimal column
> org.apache.spark.sql.hive.HiveSparkSubmitSuite.ConnectionURL
> org.apache.spark.sql.hive.StatisticsSuite.SPARK-22745 - read Hive's 
> statistics for partition
> org.apache.spark.sql.hive.StatisticsSuite.alter table rename after analyze 
> table
> org.apache.spark.sql.hive.StatisticsSuite.alter table SET TBLPROPERTIES after 
> analyze table
> org.apache.spark.sql.hive.StatisticsSuite.alter table UNSET TBLPROPERTIES 
> after analyze table
> org.apache.spark.sql.hive.client.HiveClientSuites.(It is not a test it is a 
> sbt.testing.SuiteSelector)
> org.apache.spark.sql.hive.client.VersionsSuite.success sanity check
> org.apache.spark.sql.hive.client.VersionsSuite.hadoop configuration preserved 
> 75 ms
> org.apache.spark.sql.hive.client.VersionsSuite.*: * (roughly)
> org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite.basic DDL using 
> locale tr - caseSensitive true
> org.apache.spark.sql.hive.execution.HiveDDLSuite.create Hive-serde table and 
> view with unicode columns and comment
> org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.SPARK-21617: ALTER 
> TABLE for non-compatible DataSource tables
> org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.SPARK-21617: ALTER 
> TABLE for Hive-compatible DataSource tables
> org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.SPARK-21617: ALTER 
> TABLE for Hive tables
> org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.SPARK-21617: ALTER 
> TABLE with incompatible schema on Hive-compatible table
> org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.(It is not a test it is 
> a sbt.testing.SuiteSelector)
> org.apache.spark.sql.hive.execution.SQLQuerySuite.SPARK-18355 Read data from 
> a hive table with a new column - orc
> org.apache.spark.sql.hive.execution.SQLQuerySuite.SPARK-18355 Read data from 
> a hive table with a new column - parquet
> org.apache.spark.sql.hive.orc.HiveOrcSourceSuite.SPARK-19459/SPARK-18220: 
> read char/varchar column written by Hive
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989251#comment-16989251
 ] 

t oo edited comment on SPARK-5159 at 12/5/19 11:31 PM:
---

[~yumwang] does removal of hive fork solve this one?

 


was (Author: toopt4):
[~yumwang] does removal of hive fork soove this one?

 

> Thrift server does not respect hive.server2.enable.doAs=true
> 
>
> Key: SPARK-5159
> URL: https://issues.apache.org/jira/browse/SPARK-5159
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.0
>Reporter: Andrew Ray
>Priority: Major
> Attachments: spark_thrift_server_log.txt
>
>
> I'm currently testing the spark sql thrift server on a kerberos secured 
> cluster in YARN mode. Currently any user can access any table regardless of 
> HDFS permissions as all data is read as the hive user. In HiveServer2 the 
> property hive.server2.enable.doAs=true causes all access to be done as the 
> submitting user. We should do the same.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989251#comment-16989251
 ] 

t oo commented on SPARK-5159:
-

[~yumwang] does removal of the hive fork solve this one?

 

> Thrift server does not respect hive.server2.enable.doAs=true
> 
>
> Key: SPARK-5159
> URL: https://issues.apache.org/jira/browse/SPARK-5159
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.0
>Reporter: Andrew Ray
>Priority: Major
> Attachments: spark_thrift_server_log.txt
>
>
> I'm currently testing the spark sql thrift server on a kerberos secured 
> cluster in YARN mode. Currently any user can access any table regardless of 
> HDFS permissions as all data is read as the hive user. In HiveServer2 the 
> property hive.server2.enable.doAs=true causes all access to be done as the 
> submitting user. We should do the same.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989249#comment-16989249
 ] 

t oo commented on SPARK-27750:
--

bump

 

> Standalone scheduler - ability to prioritize applications over drivers, many 
> drivers act like Denial of Service
> ---
>
> Key: SPARK-27750
> URL: https://issues.apache.org/jira/browse/SPARK-27750
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Minor
>
> If I submit 1000 spark submit drivers then they consume all the cores on my 
> cluster (essentially it acts like a Denial of Service) and no spark 
> 'application' gets to run since the cores are all consumed by the 'drivers'. 
> This feature is about having the ability to prioritize applications over 
> drivers so that at least some 'applications' can start running. I guess it 
> would be like: If (driver.state = 'submitted' and (exists some app.state = 
> 'submitted')) then set app.state = 'running'
> if all apps have app.state = 'running' then set driver.state = 'submitted' 
>  
> Secondary to this, why must a driver consume a minimum of 1 entire core?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27821) Spark WebUI - show numbers of drivers/apps in waiting/submitted/killed/running state

2019-12-05 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989248#comment-16989248
 ] 

t oo commented on SPARK-27821:
--

The duration of running drivers is missing too.

> Spark WebUI - show numbers of drivers/apps in 
> waiting/submitted/killed/running state
> 
>
> Key: SPARK-27821
> URL: https://issues.apache.org/jira/browse/SPARK-27821
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Minor
> Attachments: webui.png
>
>
> The webui shows total number of apps/drivers in running/completed state. This 
> improvement is to show total number in following more fine-grained states: 
> waiting/submitted/killed/running/completed 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25088) Rest Server default & doc updates

2019-09-13 Thread t oo (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929103#comment-16929103
 ] 

t oo commented on SPARK-25088:
--

They are on different ports; now I have to maintain a patched fork.

> Rest Server default & doc updates
> -
>
> Key: SPARK-25088
> URL: https://issues.apache.org/jira/browse/SPARK-25088
> Project: Spark
>  Issue Type: Improvement
>  Components: Deploy, Spark Core
>Affects Versions: 2.1.3, 2.2.2, 2.3.1, 2.4.0
>Reporter: Imran Rashid
>Assignee: Imran Rashid
>Priority: Major
>  Labels: release-notes
> Fix For: 2.4.0
>
>
> The rest server could use some updates on defaults & docs, both in standalone 
> and mesos.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28376) Support to write sorted parquet files in each row group

2019-07-15 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884883#comment-16884883
 ] 

t oo commented on SPARK-28376:
--

[~rdblue] can you add some color here?

> Support to write sorted parquet files in each row group
> ---
>
> Key: SPARK-28376
> URL: https://issues.apache.org/jira/browse/SPARK-28376
> Project: Spark
>  Issue Type: New Feature
>  Components: Input/Output, Spark Core
>Affects Versions: 2.4.3
>Reporter: t oo
>Priority: Major
>
> this is for the ability to write parquet with sorted values in each 
> row group
>  
> see 
> [https://stackoverflow.com/questions/52159938/cant-write-ordered-data-to-parquet-in-spark]
> [https://www.slideshare.net/RyanBlue3/parquet-performance-tuning-the-missing-guide]
>  (slides 26-27)
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28376) Write sorted parquet files

2019-07-12 Thread t oo (JIRA)
t oo created SPARK-28376:


 Summary: Write sorted parquet files
 Key: SPARK-28376
 URL: https://issues.apache.org/jira/browse/SPARK-28376
 Project: Spark
  Issue Type: New Feature
  Components: Input/Output, Spark Core
Affects Versions: 2.4.3
Reporter: t oo


this is for the ability to write parquet with sorted values in each row group

 

see 
[https://stackoverflow.com/questions/52159938/cant-write-ordered-data-to-parquet-in-spark]

[https://www.slideshare.net/RyanBlue3/parquet-performance-tuning-the-missing-guide]
 (slides 26-27)
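
For illustration, a minimal PySpark sketch of the closest workaround available today (column names and paths are placeholders): repartition by the write key and sort within each task so that rows reach the Parquet writer, and hence each row group, already ordered.

{code:python}
# Rows are sorted inside each output task before the Parquet writer buffers
# them, so each row group is written in sorted order. Names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.parquet("/data/events")

(df.repartition("day")
   .sortWithinPartitions("day", "event_time")
   .write.partitionBy("day")
   .mode("overwrite")
   .parquet("/data/events_sorted"))
{code}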

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-15420) Repartition and sort before Parquet writes

2019-07-12 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884220#comment-16884220
 ] 

t oo commented on SPARK-15420:
--

PR was abandoned :(

> Repartition and sort before Parquet writes
> --
>
> Key: SPARK-15420
> URL: https://issues.apache.org/jira/browse/SPARK-15420
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.1
>Reporter: Ryan Blue
>Priority: Major
>
> Parquet requires buffering data in memory before writing a group of rows 
> organized by column. This causes significant memory pressure when writing 
> partitioned output because each open file must buffer rows.
> Currently, Spark will sort data and spill if necessary in the 
> {{WriterContainer}} to avoid keeping many files open at once. But, this isn't 
> a full solution for a few reasons:
> * The final sort is always performed, even if incoming data is already sorted 
> correctly. For example, a global sort will cause two sorts to happen, even if 
> the global sort correctly prepares the data.
> * To prevent a large number of output small output files, users must manually 
> add a repartition step. That step is also ignored by the sort within the 
> writer.
> * Hive does not currently support {{DataFrameWriter#sortBy}}
> The sort in {{WriterContainer}} makes sense to prevent problems, but should 
> detect if the incoming data is already sorted. The {{DataFrameWriter}} should 
> also expose the ability to repartition data before the write stage, and the 
> query planner should expose an option to automatically insert repartition 
> operations.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service

2019-06-12 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-27750:
-
Description: 
If I submit 1000 spark submit drivers then they consume all the cores on my 
cluster (essentially it acts like a Denial of Service) and no spark 
'application' gets to run since the cores are all consumed by the 'drivers'. 
This feature is about having the ability to prioritize applications over 
drivers so that at least some 'applications' can start running. I guess it 
would be like: If (driver.state = 'submitted' and (exists some app.state = 
'submitted')) then set app.state = 'running'

if all apps have app.state = 'running' then set driver.state = 'submitted' 

 

Secondary to this, why must a driver consume a minimum of 1 entire core?

  was:
If I submit 1000 spark submit drivers then they consume all the cores on my 
cluster (essentially it acts like a Denial of Service) and no spark 
'application' gets to run since the cores are all consumed by the 'drivers'. 
This feature is about having the ability to prioritize applications over 
drivers so that at least some 'applications' can start running. I guess it 
would be like: If (driver.state = 'submitted' and (exists some app.state = 
'submitted')) then set app.state = 'running'

if all apps have app.state = 'running' then set driver.state = 'submitted' 


> Standalone scheduler - ability to prioritize applications over drivers, many 
> drivers act like Denial of Service
> ---
>
> Key: SPARK-27750
> URL: https://issues.apache.org/jira/browse/SPARK-27750
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 2.3.3, 2.4.3
>Reporter: t oo
>Priority: Minor
>
> If I submit 1000 spark submit drivers then they consume all the cores on my 
> cluster (essentially it acts like a Denial of Service) and no spark 
> 'application' gets to run since the cores are all consumed by the 'drivers'. 
> This feature is about having the ability to prioritize applications over 
> drivers so that at least some 'applications' can start running. I guess it 
> would be like: If (driver.state = 'submitted' and (exists some app.state = 
> 'submitted')) then set app.state = 'running'
> if all apps have app.state = 'running' then set driver.state = 'submitted' 
>  
> Secondary to this, why must a driver consume a minimum of 1 entire core?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Issue Comment Deleted] (SPARK-20202) Remove references to org.spark-project.hive

2019-06-04 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-20202:
-
Comment: was deleted

(was: gentle ping)

> Remove references to org.spark-project.hive
> ---
>
> Key: SPARK-20202
> URL: https://issues.apache.org/jira/browse/SPARK-20202
> Project: Spark
>  Issue Type: Bug
>  Components: Build, SQL
>Affects Versions: 1.6.4, 2.0.3, 2.1.1
>Reporter: Owen O'Malley
>Priority: Major
>
> Spark can't continue to depend on their fork of Hive and must move to 
> standard Hive versions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Issue Comment Deleted] (SPARK-20202) Remove references to org.spark-project.hive

2019-06-04 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-20202:
-
Comment: was deleted

(was: bump)

> Remove references to org.spark-project.hive
> ---
>
> Key: SPARK-20202
> URL: https://issues.apache.org/jira/browse/SPARK-20202
> Project: Spark
>  Issue Type: Bug
>  Components: Build, SQL
>Affects Versions: 1.6.4, 2.0.3, 2.1.1
>Reporter: Owen O'Malley
>Priority: Major
>
> Spark can't continue to depend on their fork of Hive and must move to 
> standard Hive versions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27822) Spark WebUi - for running applications have a drivername column

2019-05-24 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-27822:
-
Attachment: spark-ui-waiting-2.png

> Spark WebUi - for running applications have a drivername column
> ---
>
> Key: SPARK-27822
> URL: https://issues.apache.org/jira/browse/SPARK-27822
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 2.3.3
>Reporter: t oo
>Priority: Major
> Attachments: spark-ui-waiting-2.png
>
>
> Since every app has one driver the list of running apps could have an extra 
> column of driver name to show association between app and driver names



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27822) Spark WebUi - for running applications have a drivername column

2019-05-24 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847329#comment-16847329
 ] 

t oo commented on SPARK-27822:
--

[~hyukjin.kwon] attached

> Spark WebUi - for running applications have a drivername column
> ---
>
> Key: SPARK-27822
> URL: https://issues.apache.org/jira/browse/SPARK-27822
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 2.3.3
>Reporter: t oo
>Priority: Major
> Attachments: spark-ui-waiting-2.png
>
>
> Since every app has one driver the list of running apps could have an extra 
> column of driver name to show association between app and driver names



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27822) Spark WebUi - for running applications have a drivername column

2019-05-23 Thread t oo (JIRA)
t oo created SPARK-27822:


 Summary: Spark WebUi - for running applications have a drivername 
column
 Key: SPARK-27822
 URL: https://issues.apache.org/jira/browse/SPARK-27822
 Project: Spark
  Issue Type: Improvement
  Components: Web UI
Affects Versions: 2.3.3
Reporter: t oo


Since every app has exactly one driver, the list of running apps could have an 
extra column showing the driver name, to make the association between app and 
driver names visible



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27821) Spark WebUI - show numbers of drivers/apps in waiting/submitted/killed/running state

2019-05-23 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-27821:
-
Attachment: webui.png

> Spark WebUI - show numbers of drivers/apps in 
> waiting/submitted/killed/running state
> 
>
> Key: SPARK-27821
> URL: https://issues.apache.org/jira/browse/SPARK-27821
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 2.3.3
>Reporter: t oo
>Priority: Minor
> Attachments: webui.png
>
>
> The web UI shows the total number of apps/drivers in the running/completed 
> states. This improvement is to also show totals for the following more 
> fine-grained states: waiting/submitted/killed/running/completed 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27821) Spark WebUI - show numbers of drivers/apps in waiting/submitted/killed/running state

2019-05-23 Thread t oo (JIRA)
t oo created SPARK-27821:


 Summary: Spark WebUI - show numbers of drivers/apps in 
waiting/submitted/killed/running state
 Key: SPARK-27821
 URL: https://issues.apache.org/jira/browse/SPARK-27821
 Project: Spark
  Issue Type: Improvement
  Components: Web UI
Affects Versions: 2.3.3
Reporter: t oo


The web UI shows the total number of apps/drivers in the running/completed 
states. This improvement is to also show totals for the following more 
fine-grained states: waiting/submitted/killed/running/completed 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service

2019-05-16 Thread t oo (JIRA)
t oo created SPARK-27750:


 Summary: Standalone scheduler - ability to prioritize applications 
over drivers, many drivers act like Denial of Service
 Key: SPARK-27750
 URL: https://issues.apache.org/jira/browse/SPARK-27750
 Project: Spark
  Issue Type: New Feature
  Components: Scheduler
Affects Versions: 2.4.3, 2.3.3
Reporter: t oo


If I submit 1000 spark-submit drivers, they consume all the cores on my cluster 
(essentially it acts like a Denial of Service) and no Spark 'application' gets to 
run, since the cores are all consumed by the 'drivers'. This feature is about 
having the ability to prioritize applications over drivers so that at least some 
'applications' can start running. Roughly: if (driver.state = 'submitted' and 
(exists some app.state = 'submitted')) then launch the app first (set app.state = 
'running');

only once all apps have app.state = 'running' should the remaining waiting 
drivers be launched.
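
As a rough illustration of that ordering (not Spark code; DriverReq/AppReq and 
the core counts below are made-up placeholders), a minimal self-contained Scala 
sketch of "apps claim free cores before drivers" might look like this:

{code}
// Toy model of the requested policy: applications claim free cores first,
// so a flood of submitted drivers can no longer starve every application.
case class DriverReq(id: String, cores: Int)
case class AppReq(id: String, cores: Int)

object AppsBeforeDriversSketch {
  def schedule(freeCores: Int,
               waitingApps: Seq[AppReq],
               waitingDrivers: Seq[DriverReq]): (Seq[String], Seq[String]) = {
    var remaining = freeCores
    // 1. Applications get first pick of the free cores.
    val launchedApps = waitingApps.filter { a =>
      val fits = a.cores <= remaining
      if (fits) remaining -= a.cores
      fits
    }
    // 2. Drivers only get whatever capacity is left over.
    val launchedDrivers = waitingDrivers.filter { d =>
      val fits = d.cores <= remaining
      if (fits) remaining -= d.cores
      fits
    }
    (launchedApps.map(_.id), launchedDrivers.map(_.id))
  }

  def main(args: Array[String]): Unit = {
    val (apps, drivers) = schedule(
      freeCores = 8,
      waitingApps = Seq(AppReq("app-1", 4), AppReq("app-2", 2)),
      waitingDrivers = (1 to 1000).map(i => DriverReq(s"driver-$i", 1)))
    // Even with 1000 waiting drivers, both apps still get their cores.
    println(s"launched apps: $apps, launched drivers: ${drivers.size}")
  }
}
{code}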



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x

2019-04-25 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825804#comment-16825804
 ] 

t oo commented on SPARK-27491:
--

My current workaround to get Airflow to integrate with a Spark 2.3 cluster is to 
use two versions of Spark on the Airflow machines (Spark 2.3 to submit and 
Spark 2.1 to poll status) :) I had to override Airflow's spark_submit_operator to 
point to the two Spark paths.
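
Another workaround I have not fully verified: skip spark-class entirely for the 
status check and poll the standalone master's REST submission endpoint directly 
(assuming the usual /v1/submissions/status/<driverId> path on port 6066). A 
minimal Scala sketch:

{code}
import java.net.{HttpURLConnection, URL}
import scala.io.Source

object PollDriverStatus {
  // Usage: PollDriverStatus http://domainhere:6066 driver-20190417130324-0009
  def main(args: Array[String]): Unit = {
    val Array(master, driverId) = args
    val conn = new URL(s"$master/v1/submissions/status/$driverId")
      .openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("GET")
    try {
      // The response is JSON containing "driverState" : "RUNNING" | "FINISHED" | "FAILED" ...
      println(Source.fromInputStream(conn.getInputStream).mkString)
    } finally {
      conn.disconnect()
    }
  }
}
{code}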

> SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty 
> response! therefore Airflow won't integrate with Spark 2.3.x
> --
>
> Key: SPARK-27491
> URL: https://issues.apache.org/jira/browse/SPARK-27491
> Project: Spark
>  Issue Type: Bug
>  Components: Java API, Scheduler, Spark Core, Spark Shell, Spark 
> Submit
>Affects Versions: 2.3.3
>Reporter: t oo
>Priority: Major
>
> This issue must have been introduced after Spark 2.1.1, as it works in that 
> version. It affects me in Spark 2.3.3/2.3.0. I am using Spark standalone 
> mode, if that makes a difference.
> See below: Spark 2.3.3 returns an empty response while 2.1.1 returns a response.
>  
> Spark 2.1.1:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + export SPARK_HOME=/home/ec2here/spark_home1
> + SPARK_HOME=/home/ec2here/spark_home1
> + '[' -z /home/ec2here/spark_home1 ']'
> + . /home/ec2here/spark_home1/bin/load-spark-env.sh
> ++ '[' -z /home/ec2here/spark_home1 ']'
> ++ '[' -z '' ']'
> ++ export SPARK_ENV_LOADED=1
> ++ SPARK_ENV_LOADED=1
> ++ parent_dir=/home/ec2here/spark_home1
> ++ user_conf_dir=/home/ec2here/spark_home1/conf
> ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
> ++ set -a
> ++ . /home/ec2here/spark_home1/conf/spark-env.sh
> +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
> +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
>  ulimit -n 1048576
> ++ set +a
> ++ '[' -z '' ']'
> ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
> ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
> ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
> ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
> ++ export SPARK_SCALA_VERSION=2.10
> ++ SPARK_SCALA_VERSION=2.10
> + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
> + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
> + '[' -d /home/ec2here/spark_home1/jars ']'
> + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
> + '[' '!' -d /home/ec2here/spark_home1/jars ']'
> + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
> + '[' -n '' ']'
> + [[ -n '' ]]
> + CMD=()
> + IFS=
> + read -d '' -r ARG
> ++ build_command org.apache.spark.deploy.SparkSubmit --master 
> spark://domainhere:6066 --status driver-20190417130324-0009
> ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp 
> '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> ++ printf '%d\0' 0
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + COUNT=10
> + LAST=9
> + LAUNCHER_EXIT_CODE=0
> + [[ 0 =~ ^[0-9]+$ ]]
> + '[' 0 '!=' 0 ']'
> + CMD=("${CMD[@]:0:$LAST}")
> + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp 
> '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the 
> status of submission driver-20190417130324-0009 in spark://domainhere:6066.
> 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with 
> SubmissionStatusResponse:
> {
>  "action" : "SubmissionStatusResponse",
>  "driverState" : "FAILED",
>  "serverSparkVersion" : "2.3.3",
>  "submissionId" : "driver-20190417130324-0009",
>  "success" : true,
>  "workerHostPort" : "x.y.211.40:11819",
>  "workerId" : "worker-20190417115840-x.y.211.40-11819"
> }
> [ec2here@ip-x-y-160-225 ~]$
>  
> Spark 2.3.3:
> [ec2here@ip-x-y-160-225 ~]$ bash -x 

[jira] [Commented] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x

2019-04-17 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820542#comment-16820542
 ] 

t oo commented on SPARK-27491:
--

cc: [~skonto] [~mpmolek] [~gschiavon] [~scrapco...@gmail.com]

> SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty 
> response! therefore Airflow won't integrate with Spark 2.3.x
> --
>
> Key: SPARK-27491
> URL: https://issues.apache.org/jira/browse/SPARK-27491
> Project: Spark
>  Issue Type: Bug
>  Components: Java API, Scheduler, Spark Core, Spark Shell, Spark 
> Submit
>Affects Versions: 2.3.3
>Reporter: t oo
>Priority: Blocker
>
> This issue must have been introduced after Spark 2.1.1, as it works in that 
> version. It affects me in Spark 2.3.3/2.3.0. I am using Spark standalone 
> mode, if that makes a difference.
> See below: Spark 2.3.3 returns an empty response while 2.1.1 returns a response.
>  
> Spark 2.1.1:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + export SPARK_HOME=/home/ec2here/spark_home1
> + SPARK_HOME=/home/ec2here/spark_home1
> + '[' -z /home/ec2here/spark_home1 ']'
> + . /home/ec2here/spark_home1/bin/load-spark-env.sh
> ++ '[' -z /home/ec2here/spark_home1 ']'
> ++ '[' -z '' ']'
> ++ export SPARK_ENV_LOADED=1
> ++ SPARK_ENV_LOADED=1
> ++ parent_dir=/home/ec2here/spark_home1
> ++ user_conf_dir=/home/ec2here/spark_home1/conf
> ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
> ++ set -a
> ++ . /home/ec2here/spark_home1/conf/spark-env.sh
> +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
> +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
>  ulimit -n 1048576
> ++ set +a
> ++ '[' -z '' ']'
> ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
> ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
> ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
> ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
> ++ export SPARK_SCALA_VERSION=2.10
> ++ SPARK_SCALA_VERSION=2.10
> + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
> + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
> + '[' -d /home/ec2here/spark_home1/jars ']'
> + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
> + '[' '!' -d /home/ec2here/spark_home1/jars ']'
> + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
> + '[' -n '' ']'
> + [[ -n '' ]]
> + CMD=()
> + IFS=
> + read -d '' -r ARG
> ++ build_command org.apache.spark.deploy.SparkSubmit --master 
> spark://domainhere:6066 --status driver-20190417130324-0009
> ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp 
> '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> ++ printf '%d\0' 0
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + COUNT=10
> + LAST=9
> + LAUNCHER_EXIT_CODE=0
> + [[ 0 =~ ^[0-9]+$ ]]
> + '[' 0 '!=' 0 ']'
> + CMD=("${CMD[@]:0:$LAST}")
> + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp 
> '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the 
> status of submission driver-20190417130324-0009 in spark://domainhere:6066.
> 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with 
> SubmissionStatusResponse:
> {
>  "action" : "SubmissionStatusResponse",
>  "driverState" : "FAILED",
>  "serverSparkVersion" : "2.3.3",
>  "submissionId" : "driver-20190417130324-0009",
>  "success" : true,
>  "workerHostPort" : "x.y.211.40:11819",
>  "workerId" : "worker-20190417115840-x.y.211.40-11819"
> }
> [ec2here@ip-x-y-160-225 ~]$
>  
> Spark 2.3.3:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + '[' -z '' ']'
> ++ dirname 

[jira] [Updated] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x

2019-04-17 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-27491:
-
Summary: SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" 
returns empty response! therefore Airflow won't integrate with Spark 2.3.x  
(was: SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns 
empty response! therefore Airflow won't integrate with Spark)

> SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty 
> response! therefore Airflow won't integrate with Spark 2.3.x
> --
>
> Key: SPARK-27491
> URL: https://issues.apache.org/jira/browse/SPARK-27491
> Project: Spark
>  Issue Type: Bug
>  Components: Java API, Scheduler, Spark Core, Spark Shell, Spark 
> Submit
>Affects Versions: 2.3.3
>Reporter: t oo
>Priority: Blocker
>
> This issue must have been introduced after Spark 2.1.1, as it works in that 
> version. It affects me in Spark 2.3.3/2.3.0. I am using Spark standalone 
> mode, if that makes a difference.
> See below: Spark 2.3.3 returns an empty response while 2.1.1 returns a response.
>  
> Spark 2.1.1:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + export SPARK_HOME=/home/ec2here/spark_home1
> + SPARK_HOME=/home/ec2here/spark_home1
> + '[' -z /home/ec2here/spark_home1 ']'
> + . /home/ec2here/spark_home1/bin/load-spark-env.sh
> ++ '[' -z /home/ec2here/spark_home1 ']'
> ++ '[' -z '' ']'
> ++ export SPARK_ENV_LOADED=1
> ++ SPARK_ENV_LOADED=1
> ++ parent_dir=/home/ec2here/spark_home1
> ++ user_conf_dir=/home/ec2here/spark_home1/conf
> ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
> ++ set -a
> ++ . /home/ec2here/spark_home1/conf/spark-env.sh
> +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
> +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
>  ulimit -n 1048576
> ++ set +a
> ++ '[' -z '' ']'
> ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
> ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
> ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
> ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
> ++ export SPARK_SCALA_VERSION=2.10
> ++ SPARK_SCALA_VERSION=2.10
> + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
> + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
> + '[' -d /home/ec2here/spark_home1/jars ']'
> + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
> + '[' '!' -d /home/ec2here/spark_home1/jars ']'
> + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
> + '[' -n '' ']'
> + [[ -n '' ]]
> + CMD=()
> + IFS=
> + read -d '' -r ARG
> ++ build_command org.apache.spark.deploy.SparkSubmit --master 
> spark://domainhere:6066 --status driver-20190417130324-0009
> ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp 
> '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> ++ printf '%d\0' 0
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + COUNT=10
> + LAST=9
> + LAUNCHER_EXIT_CODE=0
> + [[ 0 =~ ^[0-9]+$ ]]
> + '[' 0 '!=' 0 ']'
> + CMD=("${CMD[@]:0:$LAST}")
> + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp 
> '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the 
> status of submission driver-20190417130324-0009 in spark://domainhere:6066.
> 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with 
> SubmissionStatusResponse:
> {
>  "action" : "SubmissionStatusResponse",
>  "driverState" : "FAILED",
>  "serverSparkVersion" : "2.3.3",
>  "submissionId" : "driver-20190417130324-0009",
>  "success" : true,
>  "workerHostPort" : "x.y.211.40:11819",
>  "workerId" : "worker-20190417115840-x.y.211.40-11819"
> }
> [ec2here@ip-x-y-160-225 ~]$
>  
> Spark 2.3.3:
> [ec2here@ip-x-y-160-225 ~]$ bash -x 

[jira] [Commented] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x

2019-04-17 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820211#comment-16820211
 ] 

t oo commented on SPARK-27491:
--

cc: [~ash] [~bolke]

> SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty 
> response! therefore Airflow won't integrate with Spark 2.3.x
> --
>
> Key: SPARK-27491
> URL: https://issues.apache.org/jira/browse/SPARK-27491
> Project: Spark
>  Issue Type: Bug
>  Components: Java API, Scheduler, Spark Core, Spark Shell, Spark 
> Submit
>Affects Versions: 2.3.3
>Reporter: t oo
>Priority: Blocker
>
> This issue must have been introduced after Spark 2.1.1, as it works in that 
> version. It affects me in Spark 2.3.3/2.3.0. I am using Spark standalone 
> mode, if that makes a difference.
> See below: Spark 2.3.3 returns an empty response while 2.1.1 returns a response.
>  
> Spark 2.1.1:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + export SPARK_HOME=/home/ec2here/spark_home1
> + SPARK_HOME=/home/ec2here/spark_home1
> + '[' -z /home/ec2here/spark_home1 ']'
> + . /home/ec2here/spark_home1/bin/load-spark-env.sh
> ++ '[' -z /home/ec2here/spark_home1 ']'
> ++ '[' -z '' ']'
> ++ export SPARK_ENV_LOADED=1
> ++ SPARK_ENV_LOADED=1
> ++ parent_dir=/home/ec2here/spark_home1
> ++ user_conf_dir=/home/ec2here/spark_home1/conf
> ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
> ++ set -a
> ++ . /home/ec2here/spark_home1/conf/spark-env.sh
> +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
> +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
>  ulimit -n 1048576
> ++ set +a
> ++ '[' -z '' ']'
> ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
> ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
> ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
> ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
> ++ export SPARK_SCALA_VERSION=2.10
> ++ SPARK_SCALA_VERSION=2.10
> + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
> + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
> + '[' -d /home/ec2here/spark_home1/jars ']'
> + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
> + '[' '!' -d /home/ec2here/spark_home1/jars ']'
> + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
> + '[' -n '' ']'
> + [[ -n '' ]]
> + CMD=()
> + IFS=
> + read -d '' -r ARG
> ++ build_command org.apache.spark.deploy.SparkSubmit --master 
> spark://domainhere:6066 --status driver-20190417130324-0009
> ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp 
> '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> ++ printf '%d\0' 0
> + CMD+=("$ARG")
> + IFS=
> + read -d '' -r ARG
> + COUNT=10
> + LAST=9
> + LAUNCHER_EXIT_CODE=0
> + [[ 0 =~ ^[0-9]+$ ]]
> + '[' 0 '!=' 0 ']'
> + CMD=("${CMD[@]:0:$LAST}")
> + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp 
> '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the 
> status of submission driver-20190417130324-0009 in spark://domainhere:6066.
> 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with 
> SubmissionStatusResponse:
> {
>  "action" : "SubmissionStatusResponse",
>  "driverState" : "FAILED",
>  "serverSparkVersion" : "2.3.3",
>  "submissionId" : "driver-20190417130324-0009",
>  "success" : true,
>  "workerHostPort" : "x.y.211.40:11819",
>  "workerId" : "worker-20190417115840-x.y.211.40-11819"
> }
> [ec2here@ip-x-y-160-225 ~]$
>  
> Spark 2.3.3:
> [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class 
> org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
> driver-20190417130324-0009
> + '[' -z '' ']'
> ++ dirname /home/ec2here/spark_home/bin/spark-class
> + source 

[jira] [Created] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark

2019-04-17 Thread t oo (JIRA)
t oo created SPARK-27491:


 Summary: SPARK REST API - "org.apache.spark.deploy.SparkSubmit 
--status" returns empty response! therefore Airflow won't integrate with Spark
 Key: SPARK-27491
 URL: https://issues.apache.org/jira/browse/SPARK-27491
 Project: Spark
  Issue Type: Bug
  Components: Java API, Scheduler, Spark Core, Spark Shell, Spark Submit
Affects Versions: 2.3.3
Reporter: t oo


This issue must have been introduced after Spark 2.1.1, as it works in that 
version. It affects me in Spark 2.3.3/2.3.0. I am using Spark standalone mode, 
if that makes a difference.

See below: Spark 2.3.3 returns an empty response while 2.1.1 returns a response.

 

Spark 2.1.1:

[ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class 
org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
driver-20190417130324-0009
+ export SPARK_HOME=/home/ec2here/spark_home1
+ SPARK_HOME=/home/ec2here/spark_home1
+ '[' -z /home/ec2here/spark_home1 ']'
+ . /home/ec2here/spark_home1/bin/load-spark-env.sh
++ '[' -z /home/ec2here/spark_home1 ']'
++ '[' -z '' ']'
++ export SPARK_ENV_LOADED=1
++ SPARK_ENV_LOADED=1
++ parent_dir=/home/ec2here/spark_home1
++ user_conf_dir=/home/ec2here/spark_home1/conf
++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']'
++ set -a
++ . /home/ec2here/spark_home1/conf/spark-env.sh
+++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
+++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64
 ulimit -n 1048576
++ set +a
++ '[' -z '' ']'
++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11
++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10
++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]]
++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']'
++ export SPARK_SCALA_VERSION=2.10
++ SPARK_SCALA_VERSION=2.10
+ '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']'
+ RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
+ '[' -d /home/ec2here/spark_home1/jars ']'
+ SPARK_JARS_DIR=/home/ec2here/spark_home1/jars
+ '[' '!' -d /home/ec2here/spark_home1/jars ']'
+ LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*'
+ '[' -n '' ']'
+ [[ -n '' ]]
+ CMD=()
+ IFS=
+ read -d '' -r ARG
++ build_command org.apache.spark.deploy.SparkSubmit --master 
spark://domainhere:6066 --status driver-20190417130324-0009
++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp 
'/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main 
org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
driver-20190417130324-0009
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
++ printf '%d\0' 0
+ CMD+=("$ARG")
+ IFS=
+ read -d '' -r ARG
+ COUNT=10
+ LAST=9
+ LAUNCHER_EXIT_CODE=0
+ [[ 0 =~ ^[0-9]+$ ]]
+ '[' 0 '!=' 0 ']'
+ CMD=("${CMD[@]:0:$LAST}")
+ exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp 
'/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m 
org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
driver-20190417130324-0009
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the 
status of submission driver-20190417130324-0009 in spark://domainhere:6066.
19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with 
SubmissionStatusResponse:
{
 "action" : "SubmissionStatusResponse",
 "driverState" : "FAILED",
 "serverSparkVersion" : "2.3.3",
 "submissionId" : "driver-20190417130324-0009",
 "success" : true,
 "workerHostPort" : "x.y.211.40:11819",
 "workerId" : "worker-20190417115840-x.y.211.40-11819"
}
[ec2here@ip-x-y-160-225 ~]$

 


Spark 2.3.3:


[ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class 
org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status 
driver-20190417130324-0009
+ '[' -z '' ']'
++ dirname /home/ec2here/spark_home/bin/spark-class
+ source /home/ec2here/spark_home/bin/find-spark-home
 dirname /home/ec2here/spark_home/bin/spark-class
+++ cd /home/ec2here/spark_home/bin
+++ pwd
++ FIND_SPARK_HOME_PYTHON_SCRIPT=/home/ec2here/spark_home/bin/find_spark_home.py
++ '[' '!' -z '' ']'
++ '[' '!' -f /home/ec2here/spark_home/bin/find_spark_home.py ']'
 dirname /home/ec2here/spark_home/bin/spark-class
+++ cd /home/ec2here/spark_home/bin/..
+++ pwd
++ export SPARK_HOME=/home/ec2here/spark_home
++ SPARK_HOME=/home/ec2here/spark_home
+ . /home/ec2here/spark_home/bin/load-spark-env.sh
++ '[' -z /home/ec2here/spark_home ']'
++ 

[jira] [Commented] (SPARK-25088) Rest Server default & doc updates

2019-04-17 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819764#comment-16819764
 ] 

t oo commented on SPARK-25088:
--

Why block REST if auth is on? For example, I want to be able to use 
unauthenticated REST AND authenticated standard submission.

> Rest Server default & doc updates
> -
>
> Key: SPARK-25088
> URL: https://issues.apache.org/jira/browse/SPARK-25088
> Project: Spark
>  Issue Type: Improvement
>  Components: Deploy, Spark Core
>Affects Versions: 2.1.3, 2.2.2, 2.3.1, 2.4.0
>Reporter: Imran Rashid
>Assignee: Imran Rashid
>Priority: Major
>  Labels: release-notes
> Fix For: 2.4.0
>
>
> The rest server could use some updates on defaults & docs, both in standalone 
> and mesos.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27429) [SQL] to_timestamp function with additional argument flag that will allow exception if value could not be cast

2019-04-10 Thread t oo (JIRA)
t oo created SPARK-27429:


 Summary: [SQL] to_timestamp function with additional argument flag 
that will allow exception if value could not be cast
 Key: SPARK-27429
 URL: https://issues.apache.org/jira/browse/SPARK-27429
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.4.1
Reporter: t oo


If I am running SQL on a CSV-based dataframe and my query has 
to_timestamp(input_col, 'yyyy-MM-dd HH:mm:ss'), and the values in input_col are 
not really timestamps (e.g. 'ABC'), then I would like the to_timestamp function to 
throw an exception rather than happily (silently) return the values as null.
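
A hedged sketch of what I mean (the strict check at the end is done by hand 
today, since the proposed flag does not exist; the column name and sample values 
are placeholders):

{code}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_timestamp}

object StrictToTimestampSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("to_timestamp-sketch").getOrCreate()
    import spark.implicits._

    val df = Seq("2019-04-10 12:00:00", "ABC").toDF("input_col")
    val parsed = df.withColumn("ts", to_timestamp(col("input_col"), "yyyy-MM-dd HH:mm:ss"))

    // Current behaviour: 'ABC' silently becomes NULL in the ts column.
    parsed.show(false)

    // Manual "strict mode": fail if any non-null input failed to parse.
    val badRows = parsed.filter(col("input_col").isNotNull && col("ts").isNull).count()
    if (badRows > 0) {
      throw new IllegalArgumentException(s"$badRows value(s) could not be cast to timestamp")
    }
    spark.stop()
  }
}
{code}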



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27284) Spark Standalone aggregated logs in 1 file per appid (ala yarn -logs -applicationId)

2019-04-01 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806462#comment-16806462
 ] 

t oo commented on SPARK-27284:
--

[~srowen] In YARN mode, if you want all logs for a single Spark app you can see 
them with a single command (yarn -logs -applicationId). But in Spark standalone 
the logs are scattered under work/, i.e. spread across many files. 
This Jira is about a single command to get all executor and driver logs in one file.
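
In the meantime, a per-host stop-gap (not the requested built-in command) can be 
scripted by walking the worker's work/ directory. This sketch assumes the default 
standalone layout work/<appId>/<executorId>/{stdout,stderr}; the argument names 
are made up:

{code}
import java.io.File
import java.nio.file.{Files, Paths, StandardOpenOption}

object AggregateStandaloneLogs {
  // Usage (run on each worker host): AggregateStandaloneLogs <workDir> <appId> <outFile>
  def main(args: Array[String]): Unit = {
    val Array(workDir, appId, outFile) = args
    val out = Paths.get(outFile)
    val executorDirs = Option(new File(workDir, appId).listFiles()).getOrElse(Array.empty[File])
    for {
      execDir <- executorDirs.sortBy(_.getName)                           // one dir per executor id
      logFile <- Option(execDir.listFiles()).getOrElse(Array.empty[File]) // stdout / stderr
    } {
      Files.write(out, s"===== ${logFile.getPath} =====\n".getBytes,
        StandardOpenOption.CREATE, StandardOpenOption.APPEND)
      Files.write(out, Files.readAllBytes(logFile.toPath),
        StandardOpenOption.CREATE, StandardOpenOption.APPEND)
    }
  }
}
{code}

I believe cluster-mode driver logs land under a separate work/driver-*/ directory, 
so they would still need to be appended separately, and logs on other hosts still 
have to be collected by hand, which is exactly why a built-in command would help.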

> Spark Standalone aggregated logs in 1 file per appid (ala  yarn -logs 
> -applicationId)
> -
>
> Key: SPARK-27284
> URL: https://issues.apache.org/jira/browse/SPARK-27284
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 2.3.3, 2.4.0
>Reporter: t oo
>Priority: Minor
>
> Feature: Spark Standalone aggregated logs in 1 file per appid (ala  yarn 
> -logs -applicationId)
>  
> This would be a single file per appid with the contents of ALL the executors' logs
> [https://stackoverflow.com/questions/46004528/how-can-i-see-the-aggregated-logs-for-a-spark-standalone-cluster]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27284) Spark Standalone aggregated logs in 1 file per appid (ala yarn -logs -applicationId)

2019-03-26 Thread t oo (JIRA)
t oo created SPARK-27284:


 Summary: Spark Standalone aggregated logs in 1 file per appid (ala 
 yarn -logs -applicationId)
 Key: SPARK-27284
 URL: https://issues.apache.org/jira/browse/SPARK-27284
 Project: Spark
  Issue Type: New Feature
  Components: Scheduler
Affects Versions: 2.4.0, 2.3.3
Reporter: t oo


Feature: Spark Standalone aggregated logs in 1 file per appid (ala  yarn -logs 
-applicationId)

 

This would be a single file per appid with the contents of ALL the executors' logs

[https://stackoverflow.com/questions/46004528/how-can-i-see-the-aggregated-logs-for-a-spark-standalone-cluster]

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode

2019-03-05 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784791#comment-16784791
 ] 

t oo commented on SPARK-26998:
--

[~gsomogyi] please take it forward.

[~kabhwan] The truststore password being shown is not much of a problem, since 
the truststore is often distributed to users anyway. But the keystore password 
still being shown is the big no-no.

> spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor 
> processes in Standalone mode
> ---
>
> Key: SPARK-26998
> URL: https://issues.apache.org/jira/browse/SPARK-26998
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Security, Spark Core
>Affects Versions: 2.3.3, 2.4.0
>Reporter: t oo
>Priority: Major
>  Labels: SECURITY, Security, secur, security, security-issue
>
> Run Spark in standalone mode, then start a spark-submit requiring at least 1 
> executor. Do a 'ps -ef' on Linux (e.g. in a PuTTY terminal) and you will be able 
> to see the spark.ssl.keyStorePassword value in plaintext!
>  
> spark.ssl.keyStorePassword and  spark.ssl.keyPassword don't need to be passed 
> to  CoarseGrainedExecutorBackend. Only  spark.ssl.trustStorePassword is used.
>  
> Can be resolved if below PR is merged:
> [[Github] Pull Request #21514 
> (tooptoop4)|https://github.com/apache/spark/pull/21514]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode

2019-03-02 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782330#comment-16782330
 ] 

t oo commented on SPARK-26998:
--

[https://github.com/apache/spark/pull/23820] is only about hiding the password 
from the log file; SPARK-26998 is about hiding passwords from showing in the 
'ps -ef' process list.

> spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor 
> processes in Standalone mode
> ---
>
> Key: SPARK-26998
> URL: https://issues.apache.org/jira/browse/SPARK-26998
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Security, Spark Core
>Affects Versions: 2.3.3, 2.4.0
>Reporter: t oo
>Priority: Major
>  Labels: SECURITY, Security, secur, security, security-issue
>
> Run Spark in standalone mode, then start a spark-submit requiring at least 1 
> executor. Do a 'ps -ef' on Linux (e.g. in a PuTTY terminal) and you will be able 
> to see the spark.ssl.keyStorePassword value in plaintext!
>  
> spark.ssl.keyStorePassword and  spark.ssl.keyPassword don't need to be passed 
> to  CoarseGrainedExecutorBackend. Only  spark.ssl.trustStorePassword is used.
>  
> Can be resolved if below PR is merged:
> [[Github] Pull Request #21514 
> (tooptoop4)|https://github.com/apache/spark/pull/21514]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode

2019-02-26 Thread t oo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

t oo updated SPARK-26998:
-
Labels: SECURITY Security secur security security-issue  (was: )

> spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor 
> processes in Standalone mode
> ---
>
> Key: SPARK-26998
> URL: https://issues.apache.org/jira/browse/SPARK-26998
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Security, Spark Core
>Affects Versions: 2.3.3, 2.4.0
>Reporter: t oo
>Priority: Major
>  Labels: SECURITY, Security, secur, security, security-issue
>
> Run Spark in standalone mode, then start a spark-submit requiring at least 1 
> executor. Do a 'ps -ef' on Linux (e.g. in a PuTTY terminal) and you will be able 
> to see the spark.ssl.keyStorePassword value in plaintext!
>  
> spark.ssl.keyStorePassword and  spark.ssl.keyPassword don't need to be passed 
> to  CoarseGrainedExecutorBackend. Only  spark.ssl.trustStorePassword is used.
>  
> Can be resolved if below PR is merged:
> [[Github] Pull Request #21514 
> (tooptoop4)|https://github.com/apache/spark/pull/21514]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode

2019-02-26 Thread t oo (JIRA)
t oo created SPARK-26998:


 Summary: spark.ssl.keyStorePassword in plaintext on 'ps -ef' 
output of executor processes in Standalone mode
 Key: SPARK-26998
 URL: https://issues.apache.org/jira/browse/SPARK-26998
 Project: Spark
  Issue Type: Bug
  Components: Scheduler, Security, Spark Core
Affects Versions: 2.4.0, 2.3.3
Reporter: t oo


Run Spark in standalone mode, then start a spark-submit requiring at least 1 
executor. Do a 'ps -ef' on Linux (e.g. in a PuTTY terminal) and you will be able 
to see the spark.ssl.keyStorePassword value in plaintext!

 

spark.ssl.keyStorePassword and  spark.ssl.keyPassword don't need to be passed 
to  CoarseGrainedExecutorBackend. Only  spark.ssl.trustStorePassword is used.

 

Can be resolved if below PR is merged:

[[Github] Pull Request #21514 
(tooptoop4)|https://github.com/apache/spark/pull/21514]
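
The description above, reduced to a toy illustration (I have not checked that 
this matches the linked PR, and this is not the real ExecutorRunner code; the 
option strings below are made up to mirror the launch command): filter the 
key-store secrets out of the java options handed to CoarseGrainedExecutorBackend 
and keep only what the executor actually needs.

{code}
object RedactExecutorCommandSketch {
  private val secretKeys = Set("spark.ssl.keyStorePassword", "spark.ssl.keyPassword")

  // Drop -D options that carry secrets the executor never uses.
  def stripSecrets(javaOpts: Seq[String]): Seq[String] =
    javaOpts.filterNot(opt => secretKeys.exists(k => opt.startsWith(s"-D$k=")))

  def main(args: Array[String]): Unit = {
    val launchCommand = Seq(
      "-Dspark.ssl.keyStorePassword=SuperSecretPassword",
      "-Dspark.ssl.keyPassword=AlsoSecret",
      "-Dspark.ssl.trustStore=/etc/ssl/truststore.jks",
      "-Dspark.ssl.trustStorePassword=StillNeededByExecutors")
    // Only the trustStore settings survive.
    println(stripSecrets(launchCommand).mkString(" "))
  }
}
{code}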



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25757) Upgrade netty-all from 4.1.17.Final to 4.1.30.Final

2019-02-23 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775873#comment-16775873
 ] 

t oo commented on SPARK-25757:
--

[~lipzhu] Do you want to upgrade netty to 4.1.33?

> Upgrade netty-all from 4.1.17.Final to 4.1.30.Final
> ---
>
> Key: SPARK-25757
> URL: https://issues.apache.org/jira/browse/SPARK-25757
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Zhu, Lipeng
>Assignee: Zhu, Lipeng
>Priority: Minor
> Fix For: 3.0.0
>
>
> Upgrade netty from 4.1.17.Final to 4.1.30.Final to fix some netty version 
> bugs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16859) History Server storage information is missing

2019-02-17 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-16859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770679#comment-16770679
 ] 

t oo commented on SPARK-16859:
--

I know how you feel; I submitted a PR to fix a security flaw (SPARK-22860), but 
the reviewers were more concerned about cosmetic aspects and did not merge it.

> History Server storage information is missing
> -
>
> Key: SPARK-16859
> URL: https://issues.apache.org/jira/browse/SPARK-16859
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.2, 2.0.0
>Reporter: Andrei Ivanov
>Priority: Major
>  Labels: historyserver, newbie
>
> It looks like job history storage tab in history server is broken for 
> completed jobs since *1.6.2*. 
> More specifically it's broken since 
> [SPARK-13845|https://issues.apache.org/jira/browse/SPARK-13845].
> I've fixed it for my installation by effectively reverting the above patch 
> ([see|https://github.com/EinsamHauer/spark/commit/3af62ea09af8bb350c8c8a9117149c09b8feba08]).
> IMHO, the most straightforward fix would be to implement 
> _SparkListenerBlockUpdated_ serialization to JSON in _JsonProtocol_ making 
> sure it works from _ReplayListenerBus_.
> The downside will be that it will still work incorrectly with pre patch job 
> histories. But then, it doesn't work since *1.6.2* anyhow.
> PS: I'd really love to have this fixed eventually. But I'm pretty new to 
> Apache Spark and missing hands on Scala experience. So  I'd prefer that it be 
> fixed by someone experienced with roadmap vision. If nobody volunteers I'll 
> try to patch myself.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16859) History Server storage information is missing

2019-02-17 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-16859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770482#comment-16770482
 ] 

t oo commented on SPARK-16859:
--

please don't leave us [~Hauer]

> History Server storage information is missing
> -
>
> Key: SPARK-16859
> URL: https://issues.apache.org/jira/browse/SPARK-16859
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.2, 2.0.0
>Reporter: Andrei Ivanov
>Priority: Major
>  Labels: historyserver, newbie
>
> It looks like job history storage tab in history server is broken for 
> completed jobs since *1.6.2*. 
> More specifically it's broken since 
> [SPARK-13845|https://issues.apache.org/jira/browse/SPARK-13845].
> I've fixed it for my installation by effectively reverting the above patch 
> ([see|https://github.com/EinsamHauer/spark/commit/3af62ea09af8bb350c8c8a9117149c09b8feba08]).
> IMHO, the most straightforward fix would be to implement 
> _SparkListenerBlockUpdated_ serialization to JSON in _JsonProtocol_ making 
> sure it works from _ReplayListenerBus_.
> The downside will be that it will still work incorrectly with pre patch job 
> histories. But then, it doesn't work since *1.6.2* anyhow.
> PS: I'd really love to have this fixed eventually. But I'm pretty new to 
> Apache Spark and missing hands on Scala experience. So  I'd prefer that it be 
> fixed by someone experienced with roadmap vision. If nobody volunteers I'll 
> try to patch myself.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17556) Executor side broadcast for broadcast joins

2019-02-17 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-17556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770481#comment-16770481
 ] 

t oo commented on SPARK-17556:
--

please don't leave us [~scwf]

> Executor side broadcast for broadcast joins
> ---
>
> Key: SPARK-17556
> URL: https://issues.apache.org/jira/browse/SPARK-17556
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, SQL
>Reporter: Reynold Xin
>Priority: Major
> Attachments: executor broadcast.pdf, executor-side-broadcast.pdf
>
>
> Currently in Spark SQL, in order to perform a broadcast join, the driver must 
> collect the result of an RDD and then broadcast it. This introduces some 
> extra latency. It might be possible to broadcast directly from executors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22860) Spark workers log ssl passwords passed to the executors

2019-02-17 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-22860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770373#comment-16770373
 ] 

t oo commented on SPARK-22860:
--

Not just in the worker log, but also in the 'ps -ef' process list.

> Spark workers log ssl passwords passed to the executors
> ---
>
> Key: SPARK-22860
> URL: https://issues.apache.org/jira/browse/SPARK-22860
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Felix K.
>Priority: Major
>
> The workers log the spark.ssl.keyStorePassword and 
> spark.ssl.trustStorePassword passed via the CLI to the executor processes. The 
> ExecutorRunner should escape passwords so that they do not appear in the 
> workers' log files at INFO level. In this example, you can see my 
> 'SuperSecretPassword' in a worker log:
> {code}
> 17/12/08 08:04:12 INFO ExecutorRunner: Launch command: 
> "/global/myapp/oem/jdk/bin/java" "-cp" 
> "/global/myapp/application/myapp_software/thing_loader_lib/core-repository-model-zzz-1.2.3-SNAPSHOT.jar
> [...]
> :/global/myapp/application/spark-2.1.1-bin-hadoop2.7/jars/*" "-Xmx16384M" 
> "-Dspark.authenticate.enableSaslEncryption=true" 
> "-Dspark.ssl.keyStorePassword=SuperSecretPassword" 
> "-Dspark.ssl.keyStore=/global/myapp/application/config/ssl/keystore.jks" 
> "-Dspark.ssl.trustStore=/global/myapp/application/config/ssl/truststore.jks" 
> "-Dspark.ssl.enabled=true" "-Dspark.driver.port=39927" 
> "-Dspark.ssl.protocol=TLS" 
> "-Dspark.ssl.trustStorePassword=SuperSecretPassword" 
> "-Dspark.authenticate=true" "-Dmyapp_IMPORT_DATE=2017-10-30" 
> "-Dmyapp.config.directory=/global/myapp/application/config" 
> "-Dsolr.httpclient.builder.factory=com.company.myapp.loader.auth.LoaderConfigSparkSolrBasicAuthConfigurer"
>  
> "-Djavax.net.ssl.trustStore=/global/myapp/application/config/ssl/truststore.jks"
>  "-XX:+UseG1GC" "-XX:+UseStringDeduplication" 
> "-Dthings.loader.export.zzz_files=false" 
> "-Dlog4j.configuration=file:/global/myapp/application/config/spark-executor-log4j.properties"
>  "-XX:+HeapDumpOnOutOfMemoryError" "-XX:+UseStringDeduplication" 
> "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" 
> "spark://CoarseGrainedScheduler@192.168.0.1:39927" "--executor-id" "2" 
> "--hostname" "192.168.0.1" "--cores" "4" "--app-id" "app-20171208080412-" 
> "--worker-url" "spark://Worker@192.168.0.1:59530"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20286) dynamicAllocation.executorIdleTimeout is ignored after unpersist

2019-02-17 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-20286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770344#comment-16770344
 ] 

t oo commented on SPARK-20286:
--

gentle ping

> dynamicAllocation.executorIdleTimeout is ignored after unpersist
> 
>
> Key: SPARK-20286
> URL: https://issues.apache.org/jira/browse/SPARK-20286
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.0.1
>Reporter: Miguel Pérez
>Priority: Major
>
> With dynamic allocation enabled, it seems that executors with cached data 
> which are unpersisted are still being killed using the 
> {{dynamicAllocation.cachedExecutorIdleTimeout}} configuration, instead of 
> {{dynamicAllocation.executorIdleTimeout}}. Assuming the default configuration 
> ({{dynamicAllocation.cachedExecutorIdleTimeout = Infinity}}), an executor 
> with unpersisted data won't be released until the job ends.
> *How to reproduce*
> - Set different values for {{dynamicAllocation.executorIdleTimeout}} and 
> {{dynamicAllocation.cachedExecutorIdleTimeout}}
> - Load a file into a RDD and persist it
> - Execute an action on the RDD (like a count) so some executors are activated.
> - When the action has finished, unpersist the RDD
> - The application UI correctly removes the persisted data from the *Storage* 
> tab, but if you look in the *Executors* tab, you will find that the executors 
> remain *active* until {{dynamicAllocation.cachedExecutorIdleTimeout}} is 
> reached (a spark-shell sketch of these steps follows below).
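
A spark-shell sketch of those reproduce steps (the input path and timeout values 
are placeholders; dynamic allocation is assumed to be already configured):

{code}
// Started with, for example:
//   spark-shell --conf spark.dynamicAllocation.enabled=true \
//     --conf spark.dynamicAllocation.executorIdleTimeout=60s \
//     --conf spark.dynamicAllocation.cachedExecutorIdleTimeout=3600s
val rdd = spark.sparkContext.textFile("hdfs:///tmp/some-large-file") // placeholder path
rdd.persist()
rdd.count()      // activates executors and caches blocks
rdd.unpersist()  // Storage tab empties, yet the executors sit idle until
                 // cachedExecutorIdleTimeout instead of executorIdleTimeout
{code}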



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25594) OOM in long running applications even with UI disabled

2019-02-16 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770230#comment-16770230
 ] 

t oo commented on SPARK-25594:
--

please merge

> OOM in long running applications even with UI disabled
> --
>
> Key: SPARK-25594
> URL: https://issues.apache.org/jira/browse/SPARK-25594
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.0, 2.4.0
>Reporter: Mridul Muralidharan
>Assignee: Mridul Muralidharan
>Priority: Major
>
> Typically for long running applications with large number of tasks it is 
> common to disable UI to minimize overhead at driver.
> Earlier, with spark ui disabled, only stage/job information was kept as part 
> of JobProgressListener.
> As part of history server scalability fixes, particularly SPARK-20643, 
> inspite of disabling UI - task information continues to be maintained in 
> memory.
> In our long running tests against spark thrift server, this eventually 
> results in OOM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-15046) When running hive-thriftserver with yarn on a secure cluster the workers fail with java.lang.NumberFormatException

2019-02-15 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-15046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769890#comment-16769890
 ] 

t oo commented on SPARK-15046:
--

gentle ping

> When running hive-thriftserver with yarn on a secure cluster the workers fail 
> with java.lang.NumberFormatException
> --
>
> Key: SPARK-15046
> URL: https://issues.apache.org/jira/browse/SPARK-15046
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.0.0
>Reporter: Trystan Leftwich
>Assignee: Marcelo Vanzin
>Priority: Blocker
> Fix For: 2.0.0
>
>
> When running hive-thriftserver with yarn on a secure cluster 
> (spark.yarn.principal and spark.yarn.keytab are set) the workers fail with 
> the following error.
> {code}
> 16/04/30 22:40:50 ERROR yarn.ApplicationMaster: Uncaught exception: 
> java.lang.NumberFormatException: For input string: "86400079ms"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Long.parseLong(Long.java:441)
>   at java.lang.Long.parseLong(Long.java:483)
>   at 
> scala.collection.immutable.StringLike$class.toLong(StringLike.scala:276)
>   at scala.collection.immutable.StringOps.toLong(StringOps.scala:29)
>   at 
> org.apache.spark.SparkConf$$anonfun$getLong$2.apply(SparkConf.scala:380)
>   at 
> org.apache.spark.SparkConf$$anonfun$getLong$2.apply(SparkConf.scala:380)
>   at scala.Option.map(Option.scala:146)
>   at org.apache.spark.SparkConf.getLong(SparkConf.scala:380)
>   at 
> org.apache.spark.deploy.SparkHadoopUtil.getTimeFromNowToRenewal(SparkHadoopUtil.scala:289)
>   at 
> org.apache.spark.deploy.yarn.AMDelegationTokenRenewer.org$apache$spark$deploy$yarn$AMDelegationTokenRenewer$$scheduleRenewal$1(AMDelegationTokenRenewer.scala:89)
>   at 
> org.apache.spark.deploy.yarn.AMDelegationTokenRenewer.scheduleLoginFromKeytab(AMDelegationTokenRenewer.scala:121)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$3.apply(ApplicationMaster.scala:243)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$3.apply(ApplicationMaster.scala:243)
>   at scala.Option.foreach(Option.scala:257)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:243)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:723)
>   at 
> org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
>   at 
> org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:721)
>   at 
> org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:748)
>   at 
> org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
> {code}
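
For what it's worth, the underlying failure is easy to reproduce on its own: a 
duration written with a unit suffix parses through SparkConf's time accessors 
but not through getLong. A small sketch (the config key below is an illustrative 
placeholder, not the real renewal key):

{code}
import org.apache.spark.SparkConf

object TimeSuffixSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf(false).set("example.renewal.interval", "86400079ms")
    println(conf.getTimeAsMs("example.renewal.interval")) // 86400079
    println(conf.getLong("example.renewal.interval", 0L)) // NumberFormatException: "86400079ms"
  }
}
{code}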



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22374) STS ran into OOM in a secure cluster

2019-02-15 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-22374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769885#comment-16769885
 ] 

t oo commented on SPARK-22374:
--

gentle ping

> STS ran into OOM in a secure cluster
> 
>
> Key: SPARK-22374
> URL: https://issues.apache.org/jira/browse/SPARK-22374
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Dongjoon Hyun
>Priority: Major
> Attachments: 1.png, 2.png, 3.png
>
>
> In a secure cluster, FileSystem.CACHE grows indefinitely.
> *ENVIRONMENT*
> 1. `spark.yarn.principal` and `spark.yarn.keytab` is used.
> 2. Spark Thrift Server run with `doAs` false.
> {code}
> 
>   hive.server2.enable.doAs
>   false
> 
> {code}
> With 6GB (-Xmx6144m) options, `HiveConf` consumes 4GB inside FileSystem.CACHE.
> {code}
> 20,030 instances of "org.apache.hadoop.hive.conf.HiveConf", loaded by 
> "sun.misc.Launcher$AppClassLoader @ 0x64001c160" occupy 4,418,101,352 
> (73.42%) bytes. These instances are referenced from one instance of 
> "java.util.HashMap$Node[]", loaded by ""
> {code}
> Please see the attached images.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


