[jira] [Created] (SPARK-44025) CSV Table Read Error with CharType(length) column

2023-06-11 Thread Fengyu Cao (Jira)
Fengyu Cao created SPARK-44025:
--

 Summary: CSV Table Read Error with CharType(length) column
 Key: SPARK-44025
 URL: https://issues.apache.org/jira/browse/SPARK-44025
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.4.0
 Environment: {{apache/spark:v3.4.0 image}}
Reporter: Fengyu Cao


Problem:
 # read a CSV-format table
 # the table has a `CharType(length)` column
 # reading the table fails with an exception: `org.apache.spark.SparkException: Job 
aborted due to stage failure: Task 0 in stage 36.0 failed 4 times, most recent 
failure: Lost task 0.3 in stage 36.0 (TID 72) (10.113.9.208 executor 11): 
java.lang.IllegalArgumentException: requirement failed: requiredSchema 
(struct) should be the subset of dataSchema 
(struct).`

 

Reproduce with the official image:
 # {{docker run -it apache/spark:v3.4.0 /opt/spark/bin/spark-sql}}
 # {{CREATE TABLE csv_bug (name STRING, age INT, job CHAR(4)) USING CSV OPTIONS 
('header' = 'true', 'sep' = ';') LOCATION 
"/opt/spark/examples/src/main/resources/people.csv";}}
 # SELECT * FROM csv_bug;
 # ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: requirement failed: requiredSchema 
(struct) should be the subset of dataSchema 
(struct).
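The `requirement failed` check above can be sketched in plain Python. This is a toy model of the subset test, not Spark's actual implementation, and the schema literals are hypothetical stand-ins for the repro table; it only suggests why a `CHAR(length)` column trips the check: the required schema keeps the char type while the CSV data schema is read as plain strings.

```python
# Simplified illustration of the invariant in the error message: every field
# the query requires must appear, with an identical type, in the data schema
# the CSV reader uses. Toy model, not Spark's code; schemas are hypothetical
# stand-ins for the repro table (name STRING, age INT, job CHAR(4)).

def is_subset(required_schema, data_schema):
    """True if every (name, type) pair in required_schema occurs in data_schema."""
    data_fields = dict(data_schema)
    return all(name in data_fields and data_fields[name] == dtype
               for name, dtype in required_schema)

data_schema = [("name", "string"), ("age", "int"), ("job", "string")]

print(is_subset([("name", "string")], data_schema))   # True
# If CHAR(4) survives in the required schema while the CSV data schema
# carries a plain string, the requirement fails:
print(is_subset([("job", "char(4)")], data_schema))   # False
```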



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39763) Executor memory footprint substantially increases while reading zstd compressed parquet files

2022-09-09 Thread Fengyu Cao (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17602154#comment-17602154
 ] 

Fengyu Cao commented on SPARK-39763:


[https://github.com/apache/parquet-mr/pull/982]

> Executor memory footprint substantially increases while reading zstd 
> compressed parquet files
> -
>
> Key: SPARK-39763
> URL: https://issues.apache.org/jira/browse/SPARK-39763
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.2.0
>Reporter: Yeachan Park
>Priority: Minor
>
> Hi all,
>  
> While transitioning from the default snappy compression to zstd, we noticed a 
> substantial increase in executor memory whilst *reading* and applying 
> transformations on *zstd* compressed parquet files.
> Memory footprint increased increased 3 fold in some cases, compared to 
> reading and applying the same transformations on a parquet file compressed 
> with snappy.
> This behaviour only occurs when reading zstd compressed parquet files. 
> Writing a zstd parquet file does not result in this behaviour.
> To reproduce:
>  # Set "spark.sql.parquet.compression.codec" to zstd
>  # Write some parquet files, the compression will default to zstd after 
> setting the option above
>  # Read the compressed zstd file and run some transformations. Compare the 
> memory usage of the executor vs running the same transformation on a parquet 
> file with snappy compression.






[jira] [Commented] (SPARK-39763) Executor memory footprint substantially increases while reading zstd compressed parquet files

2022-08-30 Thread Fengyu Cao (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597685#comment-17597685
 ] 

Fengyu Cao commented on SPARK-39763:


We hit the same problem.

 

One of our datasets is 75 GB as zstd parquet (134 GB as snappy):
{code:python}
# 10 executors
# Executor Reqs: memoryOverhead: [amount: 3072] cores: [amount: 4] memory:
# [amount: 10240] offHeap: [amount: 4096] Task Reqs: cpus: [amount: 1.0]

# with spark.sql.parquet.enableVectorizedReader=false
df = spark.read.parquet("dataset_zstd")
df.write.mode("overwrite").format("noop").save()
{code}
Tasks fail with OOM, but with the snappy dataset everything is fine.

 







[jira] [Created] (SPARK-34519) ExecutorPodsAllocator use exponential backoff strategy when request executor pod failed

2021-02-24 Thread Fengyu Cao (Jira)
Fengyu Cao created SPARK-34519:
--

 Summary: ExecutorPodsAllocator use exponential backoff strategy 
when request executor pod failed
 Key: SPARK-34519
 URL: https://issues.apache.org/jira/browse/SPARK-34519
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.0.1
 Environment: spark 3.0.1

kubernetes 1.18.8
Reporter: Fengyu Cao


# create a resource quota: `kubectl create quota test --hard=cpu=20,memory=60G`
 # submit an application requesting more than the quota: `spark-submit --executor-cores 
5 --executor-memory 10G --num-executors 10 `
 # `ExecutorPodsAllocator: Going to request 5 executors from Kubernetes.` seems 
to be printed every second

`spark.kubernetes.allocation.batch.delay` defaults to 1s, which is good enough 
when allocation succeeds, but exponential backoff may be a better choice when 
allocation fails.
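The suggested strategy can be sketched as follows. This is a hypothetical backoff schedule in Python; the function and parameter names are illustrative, not actual Spark configuration keys or internals.

```python
import random

def backoff_delays(base=1.0, factor=2.0, cap=60.0, retries=6, jitter=False):
    """Hypothetical exponential-backoff schedule for executor pod requests:
    start at the current 1s batch delay and double after each failed
    allocation, capped so a long quota outage doesn't grow the wait forever."""
    delay = base
    delays = []
    for _ in range(retries):
        # optional jitter spreads retries from many drivers hitting one quota
        delays.append(delay if not jitter else random.uniform(0, delay))
        delay = min(delay * factor, cap)
    return delays

print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

With this schedule, the "Going to request N executors" log would be emitted at 1s, then 2s, 4s, ... intervals instead of every second while the resource quota blocks allocation.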






[jira] [Commented] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2019-01-23 Thread Fengyu Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750667#comment-16750667
 ] 

Fengyu Cao commented on SPARK-26389:


A force clean-up flag might help when HDFS is not used.

The size of temp checkpoints outside HDFS is not acceptable for me:
 # nginx logs grouped by uid (1h window, 5min slide window)
 # ran about 4 hours on 2 executor hosts (default trigger)
 # more than 1 GB on each host

It seems the HDFS state store clean-up logic does not work well on a non-HDFS 
file system (xfs).

 

Thanks anyway. Should I close this issue or change its type/priority?
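The force clean-up idea can be sketched in Python, using an interpreter exit hook as an analogy for what a JVM shutdown hook on the executor could do. This is an assumption about the proposed behaviour, not current Spark code; the function name is made up for illustration.

```python
import atexit
import os
import shutil
import tempfile

def make_temp_checkpoint_dir():
    """Create a temp checkpoint dir and register unconditional clean-up on
    process exit (hypothetical sketch of a 'force clean-up' flag: the dir
    is removed even if the query failed)."""
    path = tempfile.mkdtemp(prefix="temporary-")
    atexit.register(shutil.rmtree, path, ignore_errors=True)
    return path

ckpt = make_temp_checkpoint_dir()
assert os.path.isdir(ckpt)
# On interpreter exit, atexit removes the directory regardless of how the
# "query" ended; a JVM shutdown hook could do the same on the executor.
```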

> temp checkpoint folder at executor should be deleted on graceful shutdown
> -
>
> Key: SPARK-26389
> URL: https://issues.apache.org/jira/browse/SPARK-26389
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.4.0
>Reporter: Fengyu Cao
>Priority: Major
>
> {{spark-submit --master mesos:// -conf 
> spark.streaming.stopGracefullyOnShutdown=true  framework>}}
> CTRL-C, framework shutdown
> {{18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = 
> f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 
> 24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error 
> org.apache.spark.SparkException: Writing job aborted.}}
> {{/tmp/temporary- on executor not deleted due to 
> org.apache.spark.SparkException: Writing job aborted., and this temp 
> checkpoint can't used to recovery.}}
>  






[jira] [Commented] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2019-01-08 Thread Fengyu Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737776#comment-16737776
 ] 

Fengyu Cao commented on SPARK-26389:


Hmm, we didn't configure HDFS.

After HDFS was configured, the temp checkpoint is stored at 
hdfs://cluster/tmp/temporary- instead of the executor host's 
/tmp/temporary-.

So the problem only occurs when HDFS is not configured (the temp checkpoint is 
stored under the executor's java.io.tmpdir).







[jira] [Comment Edited] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2019-01-03 Thread Fengyu Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733815#comment-16733815
 ] 

Fengyu Cao edited comment on SPARK-26389 at 1/4/19 4:03 AM:


{quote}Temp checkpoint can be used in one-node scenario and deleted only if the 
query didn't fail.
{quote}
Yes, and there are no logs or error messages saying that we *must* set a 
non-temp checkpoint if we run a framework non-locally.

And if we do this (run non-locally with a temp checkpoint), the checkpoint dir 
on the executor consumes lots of space and is not deleted if the query fails, 
and this checkpoint can't be used for recovery as I mentioned above.

I just think that Spark should either prohibit users from using temp 
checkpoints when their frameworks are non-local, or be responsible for cleaning 
up this useless checkpoint directory even if the query fails.

 

 


was (Author: camper42):
{quote}Temp checkpoint can be used in one-node scenario and deleted only if the 
query didn't fail.
{quote}
Yes, and there're no logs or error msgs says that we *must* set a non-temp 
checkpoint if we run a framework non-local

And if we do this(run non-local with temp checkpoint), the checkpoint dir on 
executor consume lots of space and not be deleted if the query if fail, and 
this checkpoint can't be used to recover as I mentioned above.

I just think that spark either should prohibits users from using temp 
checkpoints when their frameworks are non-local, or should be responsible for 
cleaning up this useless checkpoint directory even if the query fails.

 

 







[jira] [Commented] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2019-01-03 Thread Fengyu Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733815#comment-16733815
 ] 

Fengyu Cao commented on SPARK-26389:


{quote}Temp checkpoint can be used in one-node scenario and deleted only if the 
query didn't fail.
{quote}
Yes, and there are no logs or error messages saying that we *must* set a 
non-temp checkpoint if we run a framework non-locally.

And if we do this (run non-locally with a temp checkpoint), the checkpoint dir 
on the executor consumes lots of space and is not deleted if the query fails, 
and this checkpoint can't be used for recovery as I mentioned above.

I just think that Spark should either prohibit users from using temp 
checkpoints when their frameworks are non-local, or be responsible for cleaning 
up this useless checkpoint directory even if the query fails.

 

 







[jira] [Commented] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2018-12-19 Thread Fengyu Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725533#comment-16725533
 ] 

Fengyu Cao commented on SPARK-26389:


The console output forces using a temp checkpoint (I just want to test my 
code), and there is no way to disable checkpointing.







[jira] [Comment Edited] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2018-12-19 Thread Fengyu Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725528#comment-16725528
 ] 

Fengyu Cao edited comment on SPARK-26389 at 12/20/18 2:45 AM:
--

Thanks for the reply.

Two scenarios:
 # {{temp checkpoint dir /tmp/temporary- on worker node}}
 # framework restarts
 # {{temp checkpoint dir is now /tmp/temporary- (/tmp/temporary- 
can't be used for recovery and should be deleted)}}

 
 # {{temp checkpoint dir /tmp/temporary- on worker node}}
 # executor stops for some reason
 # executor starts on another worker node (/tmp/temporary- can't be used for 
recovery either)

 

Maybe the temp checkpoint dir should be deleted on JVM stop?

 

 
{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not 
Structured Streaming one.
{quote}
 

Sorry, I didn't notice this.


was (Author: camper42):
thanks for reply

Two scenarios:
 # {{temp checkpoint dir /tmp/temporary-}}
 # framework restart
 # {{temp checkpoint dir now /tmp/temporary- (/tmp/temporary- 
can't used to recovery and should be deleted)}}

 
 # {{temp checkpoint dir /tmp/temporary-}}
 # executor stop in some reason
 # executor start on another worker nodes (/tmp/temporary- can't used to 
recovery either)

 

Maybe temp checkpoint dir should be deleted on JVM stop?

 

 
{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not 
Structured Streaming one.
{quote}
 

sorry, I didn't notice this.







[jira] [Comment Edited] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2018-12-19 Thread Fengyu Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725528#comment-16725528
 ] 

Fengyu Cao edited comment on SPARK-26389 at 12/20/18 2:44 AM:
--

Thanks for the reply.

Two scenarios:
 # {{temp checkpoint dir /tmp/temporary-}}
 # framework restarts
 # {{temp checkpoint dir is now /tmp/temporary- (/tmp/temporary- 
can't be used for recovery and should be deleted)}}

 
 # {{temp checkpoint dir /tmp/temporary-}}
 # executor stops for some reason
 # executor starts on another worker node (/tmp/temporary- can't be used for 
recovery either)

 

Maybe the temp checkpoint dir should be deleted on JVM stop?

 

 
{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not 
Structured Streaming one.
{quote}
 

Sorry, I didn't notice this.


was (Author: camper42):
thanks for reply

Two scenarios:
 # {{temp checkpoint dir /tmp/temporary-}}
 # framework restart
 # {{temp checkpoint dir now /tmp/temporary- (/tmp/temporary- 
can't used to recovery and should be deleted)}}

 
 # {{temp checkpoint dir /tmp/temporary-}}
 # executor stop in some reason
 # executor start on another worker nodes (/tmp/temporary- can't used to 
recovery either)

 

May be temp checkpoint dir should be deleted on JVM stop?

 

 
{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not 
Structured Streaming one.
{quote}
 

sorry, I didn't notice this.







[jira] [Commented] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2018-12-19 Thread Fengyu Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725528#comment-16725528
 ] 

Fengyu Cao commented on SPARK-26389:


Thanks for the reply.

Two scenarios:
 # {{temp checkpoint dir /tmp/temporary-}}
 # framework restarts
 # {{temp checkpoint dir is now /tmp/temporary- (/tmp/temporary- 
can't be used for recovery and should be deleted)}}

 
 # {{temp checkpoint dir /tmp/temporary-}}
 # executor stops for some reason
 # executor starts on another worker node (/tmp/temporary- can't be used for 
recovery either)

 

Maybe the temp checkpoint dir should be deleted on JVM stop?

 

 
{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not 
Structured Streaming one.
{quote}
 

Sorry, I didn't notice this.







[jira] [Updated] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2018-12-17 Thread Fengyu Cao (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengyu Cao updated SPARK-26389:
---
Description: 
{{spark-submit --master mesos:// -conf 
spark.streaming.stopGracefullyOnShutdown=true }}

CTRL-C, framework shutdown

{{18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = 
f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 
24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error 
org.apache.spark.SparkException: Writing job aborted.}}

{{/tmp/temporary- on executor not deleted due to 
org.apache.spark.SparkException: Writing job aborted., and this temp checkpoint 
can't used to recovery.}}

 

  was:
{{spark-submit --master mesos:// -conf 
spark.streaming.stopGracefullyOnShutdown=true }}

CTRL-C, framework shutdown

{{18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = 
f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 
24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error}}
{{ org.apache.spark.SparkException: Writing job aborted.}}

{{/tmp/temporary- on executor not deleted due to 
org.apache.spark.SparkException: Writing job aborted., and this temp checkpoint 
can't used to recovery.}}

 








[jira] [Updated] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2018-12-17 Thread Fengyu Cao (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengyu Cao updated SPARK-26389:
---
Description: 
`spark-submit --master mesos://- -conf 
spark.streaming.stopGracefullyOnShutdown=true `

CTRL-C, framework shutdown

18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = 
f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 
24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error
 org.apache.spark.SparkException: Writing job aborted.

/tmp/temporary- on executor not deleted due to 
`org.apache.spark.SparkException: Writing job aborted.`, and this temp 
checkpoint can't used to recovery.

 

  was:
spark-submit --master mesos:// --conf 
spark.streaming.stopGracefullyOnShutdown=true 

CTRL-C, framework shutdown

18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = 
f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 
24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error
org.apache.spark.SparkException: Writing job aborted.

/tmp/temporary- on executor not deleted due to 
`org.apache.spark.SparkException: Writing job aborted.`, and this temp 
checkpoint can't used to recovery.

 








[jira] [Updated] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2018-12-17 Thread Fengyu Cao (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengyu Cao updated SPARK-26389:
---
Description: 
{{spark-submit --master mesos:// -conf 
spark.streaming.stopGracefullyOnShutdown=true }}

CTRL-C, framework shutdown

{{18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = 
f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 
24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error}}
{{ org.apache.spark.SparkException: Writing job aborted.}}

{{/tmp/temporary- on executor not deleted due to 
org.apache.spark.SparkException: Writing job aborted., and this temp checkpoint 
can't used to recovery.}}

 

  was:
`spark-submit --master mesos://- -conf 
spark.streaming.stopGracefullyOnShutdown=true `

CTRL-C, framework shutdown

18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = 
f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 
24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error
 org.apache.spark.SparkException: Writing job aborted.

/tmp/temporary- on executor not deleted due to 
`org.apache.spark.SparkException: Writing job aborted.`, and this temp 
checkpoint can't used to recovery.

 








[jira] [Created] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2018-12-17 Thread Fengyu Cao (JIRA)
Fengyu Cao created SPARK-26389:
--

 Summary: temp checkpoint folder at executor should be deleted on 
graceful shutdown
 Key: SPARK-26389
 URL: https://issues.apache.org/jira/browse/SPARK-26389
 Project: Spark
  Issue Type: Bug
  Components: Structured Streaming
Affects Versions: 2.4.0
Reporter: Fengyu Cao


spark-submit --master mesos:// --conf 
spark.streaming.stopGracefullyOnShutdown=true 

CTRL-C, framework shutdown

18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = 
f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 
24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error
org.apache.spark.SparkException: Writing job aborted.

/tmp/temporary- on executor is not deleted due to 
`org.apache.spark.SparkException: Writing job aborted.`, and this temp 
checkpoint can't be used for recovery.

 


