[ https://issues.apache.org/jira/browse/SPARK-25921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bago Amirbekian updated SPARK-25921: ------------------------------------ Description: Running a barrier job after a normal spark job causes the barrier job to run without a BarrierTaskContext. Here is some code to reproduce. {code:java} def task(*args): from pyspark import BarrierTaskContext context = BarrierTaskContext.get() context.barrier() print("in barrier phase") context.barrier() return [] a = sc.parallelize(list(range(4))).map(lambda x: x ** 2).collect() assert a == [0, 1, 4, 9] b = sc.parallelize(list(range(4)), 4).barrier().mapPartitions(task).collect() {code} was: Running a barrier job after a normal spark job causes the barrier job to run without a BarrierTaskContext. Here is some code to reproduce. {code:java} def task(*args): from pyspark import BarrierTaskContext context = BarrierTaskContext.get() context.barrier() print("in barrier phase") context.barrier() return [] a = sc.parallelize(list(range(4))).map(lambda x: x ** 2).collect() assert a == [0, 1, 4, 9] b = sc.parallelize(list(range(4)), 4).barrier().mapPartitions(task).collect() {code} > Python worker reuse causes Barrier tasks to run without BarrierTaskContext > -------------------------------------------------------------------------- > > Key: SPARK-25921 > URL: https://issues.apache.org/jira/browse/SPARK-25921 > Project: Spark > Issue Type: Bug > Components: PySpark, Spark Core > Affects Versions: 2.4.0 > Reporter: Bago Amirbekian > Priority: Major > > Running a barrier job after a normal spark job causes the barrier job to run > without a BarrierTaskContext. Here is some code to reproduce. > > {code:java} > def task(*args): > from pyspark import BarrierTaskContext > context = BarrierTaskContext.get() > context.barrier() > print("in barrier phase") > context.barrier() > return [] > a = sc.parallelize(list(range(4))).map(lambda x: x ** 2).collect() > assert a == [0, 1, 4, 9] > b = sc.parallelize(list(range(4)), 4).barrier().mapPartitions(task).collect() > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org