[jira] [Commented] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2021-11-07 Thread Jin Xing (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440194#comment-17440194 ] Jin Xing commented on SPARK-30602: -- Hi [~mshen]  Thanks a lot for the feature of Remot

[jira] [Commented] (SPARK-27696) kubernetes driver pod not deleted after finish.

2021-10-19 Thread Jin Xing (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430971#comment-17430971 ] Jin Xing commented on SPARK-27696: -- We also suffer from this issue. From our side, we h

[jira] [Commented] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2018-10-13 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648880#comment-16648880 ] jin xing commented on SPARK-18134: -- [~maropu] I split this JIRA to two subtasks: SPARK-

[jira] [Created] (SPARK-25725) Add a new expression "SortMaps" to sort the entries inside a "MapData"

2018-10-13 Thread jin xing (JIRA)
jin xing created SPARK-25725: Summary: Add a new expression "SortMaps" to sort the entries inside a "MapData" Key: SPARK-25725 URL: https://issues.apache.org/jira/browse/SPARK-25725 Project: Spark

[jira] [Created] (SPARK-25724) Add sorting functionality in MapType

2018-10-13 Thread jin xing (JIRA)
jin xing created SPARK-25724: Summary: Add sorting functionality in MapType Key: SPARK-25724 URL: https://issues.apache.org/jira/browse/SPARK-25724 Project: Spark Issue Type: Sub-task C

[jira] [Created] (SPARK-24379) BroadcastExchangeExec should catch SparkOutOfMemory and re-throw SparkFatalException, which wraps SparkOutOfMemory inside.

2018-05-24 Thread jin xing (JIRA)
jin xing created SPARK-24379: Summary: BroadcastExchangeExec should catch SparkOutOfMemory and re-throw SparkFatalException, which wraps SparkOutOfMemory inside. Key: SPARK-24379 URL: https://issues.apache.org/jira/br

[jira] [Updated] (SPARK-24379) BroadcastExchangeExec should catch SparkOutOfMemory and re-throw SparkFatalException, which wraps SparkOutOfMemory inside.

2018-05-24 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-24379: - Description: After SPARK-22827, Spark won't fail the entire executor but only fails the task suffering

[jira] [Created] (SPARK-24294) Throw SparkException when OOM in BroadcastExchangeExec

2018-05-16 Thread jin xing (JIRA)
jin xing created SPARK-24294: Summary: Throw SparkException when OOM in BroadcastExchangeExec Key: SPARK-24294 URL: https://issues.apache.org/jira/browse/SPARK-24294 Project: Spark Issue Type: Bu

[jira] [Created] (SPARK-24240) Add a config to control whether InMemoryFileIndex should update cache when refresh.

2018-05-10 Thread jin xing (JIRA)
jin xing created SPARK-24240: Summary: Add a config to control whether InMemoryFileIndex should update cache when refresh. Key: SPARK-24240 URL: https://issues.apache.org/jira/browse/SPARK-24240 Project:

[jira] [Created] (SPARK-24193) Sort by disk when number of limit is big in TakeOrderedAndProjectExec

2018-05-06 Thread jin xing (JIRA)
jin xing created SPARK-24193: Summary: Sort by disk when number of limit is big in TakeOrderedAndProjectExec Key: SPARK-24193 URL: https://issues.apache.org/jira/browse/SPARK-24193 Project: Spark

[jira] [Updated] (SPARK-24143) filter empty blocks when convert mapstatus to (blockId, size) pair

2018-05-01 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-24143: - Issue Type: Bug (was: New Feature) > filter empty blocks when convert mapstatus to (blockId, size) pair

[jira] [Created] (SPARK-24143) filter empty blocks when convert mapstatus to (blockId, size) pair

2018-05-01 Thread jin xing (JIRA)
jin xing created SPARK-24143: Summary: filter empty blocks when convert mapstatus to (blockId, size) pair Key: SPARK-24143 URL: https://issues.apache.org/jira/browse/SPARK-24143 Project: Spark I

[jira] [Updated] (SPARK-23948) Trigger mapstage's job listener in submitMissingTasks

2018-04-10 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23948: - Description: SparkContext submitted a map stage from "submitMapStage" to DAGScheduler,  "markMapStageJobA

[jira] [Updated] (SPARK-23948) Trigger mapstage's job listener in submitMissingTasks

2018-04-09 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23948: - Description: SparkContext submitted a map stage from "submitMapStage" to DAGScheduler,  "markMapStageJobA

[jira] [Created] (SPARK-23948) Trigger mapstage's job listener in submitMissingTasks

2018-04-09 Thread jin xing (JIRA)
jin xing created SPARK-23948: Summary: Trigger mapstage's job listener in submitMissingTasks Key: SPARK-23948 URL: https://issues.apache.org/jira/browse/SPARK-23948 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-23669) Executors fetch jars and name the jars with md5 prefix

2018-03-13 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23669: - Description: In our cluster, there are lots of UDF jars, some of them have the same filename but differe

[jira] [Updated] (SPARK-23669) Executors fetch jars and name the jars with md5 prefix

2018-03-13 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23669: - Description: In our cluster, there are lots of UDF jars, some of them have the same filename but differe

[jira] [Created] (SPARK-23669) Executors fetch jars and name the jars with md5 prefix

2018-03-13 Thread jin xing (JIRA)
jin xing created SPARK-23669: Summary: Executors fetch jars and name the jars with md5 prefix Key: SPARK-23669 URL: https://issues.apache.org/jira/browse/SPARK-23669 Project: Spark Issue Type: Ne

[jira] [Updated] (SPARK-23669) Executors fetch jars and name the jars with md5 prefix

2018-03-13 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23669: - Description: In our cluster, there are lots of UDF jars, some of them have the same filename but differe

[jira] [Updated] (SPARK-23669) Executors fetch jars and name the jars with md5 prefix

2018-03-13 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23669: - Environment: (was: In our cluster, there are lots of UDF jars, some of them have the same filename bu

[jira] [Commented] (SPARK-23637) Yarn might allocate more resource if a same executor is killed multiple times.

2018-03-08 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392457#comment-16392457 ] jin xing commented on SPARK-23637: -- PR here: https://github.com/apache/spark/pull/20781

[jira] [Updated] (SPARK-23637) Yarn might allocate more resource if a same executor is killed multiple times.

2018-03-08 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23637: - Description: *{{YarnAllocator}}* uses *{{numExecutorsRunning}}* to track the number of running executor.

[jira] [Updated] (SPARK-23637) Yarn might allocate more resource if a same executor is killed multiple times.

2018-03-08 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23637: - Description: YarnAllocator}} uses {{numExecutorsRunning to track the number of running executor.

[jira] [Updated] (SPARK-23637) Yarn might allocate more resource if a same executor is killed multiple times.

2018-03-08 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23637: - Description: {{YarnAllocator }}uses {{numExecutorsRunning}} to track the number of running executor. {{n

[jira] [Created] (SPARK-23637) Yarn might allocate more resource if a same executor is killed multiple times.

2018-03-08 Thread jin xing (JIRA)
jin xing created SPARK-23637: Summary: Yarn might allocate more resource if a same executor is killed multiple times. Key: SPARK-23637 URL: https://issues.apache.org/jira/browse/SPARK-23637 Project: Spark

[jira] [Updated] (SPARK-23637) Yarn might allocate more resource if a same executor is killed multiple times.

2018-03-08 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-23637: - Description: {{YarnAllocator}} uses {{numExecutorsRunning}} to track the number of running executor. {{

[jira] [Created] (SPARK-23524) Big local shuffle blocks should not be checked for corruption.

2018-02-27 Thread jin xing (JIRA)
jin xing created SPARK-23524: Summary: Big local shuffle blocks should not be checked for corruption. Key: SPARK-23524 URL: https://issues.apache.org/jira/browse/SPARK-23524 Project: Spark Issue

[jira] [Created] (SPARK-22676) Avoid iterating all partition paths when spark.sql.hive.verifyPartitionPath=true

2017-12-02 Thread jin xing (JIRA)
jin xing created SPARK-22676: Summary: Avoid iterating all partition paths when spark.sql.hive.verifyPartitionPath=true Key: SPARK-22676 URL: https://issues.apache.org/jira/browse/SPARK-22676 Project: Spa

[jira] [Updated] (SPARK-22435) Support processing array and map type using script

2017-11-03 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-22435: - Priority: Major (was: Critical) > Support processing array and map type using script > -

[jira] [Created] (SPARK-22435) Support processing array and map type using script

2017-11-03 Thread jin xing (JIRA)
jin xing created SPARK-22435: Summary: Support processing array and map type using script Key: SPARK-22435 URL: https://issues.apache.org/jira/browse/SPARK-22435 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-22384) Refine partition pruning when attribute is wrapped in Cast

2017-10-28 Thread jin xing (JIRA)
jin xing created SPARK-22384: Summary: Refine partition pruning when attribute is wrapped in Cast Key: SPARK-22384 URL: https://issues.apache.org/jira/browse/SPARK-22384 Project: Spark Issue Type

[jira] [Created] (SPARK-22350) Select grouping__id from subquery

2017-10-25 Thread jin xing (JIRA)
jin xing created SPARK-22350: Summary: Select grouping__id from subquery Key: SPARK-22350 URL: https://issues.apache.org/jira/browse/SPARK-22350 Project: Spark Issue Type: Improvement C

[jira] [Created] (SPARK-22334) Check table size from Hdfs in case the size in metastore is wrong.

2017-10-23 Thread jin xing (JIRA)
jin xing created SPARK-22334: Summary: Check table size from Hdfs in case the size in metastore is wrong. Key: SPARK-22334 URL: https://issues.apache.org/jira/browse/SPARK-22334 Project: Spark I

[jira] [Updated] (SPARK-22334) Check table size from HDFS in case the size in metastore is wrong.

2017-10-23 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-22334: - Summary: Check table size from HDFS in case the size in metastore is wrong. (was: Check table size from

[jira] [Created] (SPARK-21993) Close CliSessionState in shutdown hook of SparkSQLCLIDriver

2017-09-13 Thread jin xing (JIRA)
jin xing created SPARK-21993: Summary: Close CliSessionState in shutdown hook of SparkSQLCLIDriver Key: SPARK-21993 URL: https://issues.apache.org/jira/browse/SPARK-21993 Project: Spark Issue Ty

[jira] [Created] (SPARK-21916) Set isolationOn=true when create client to remote hive metastore

2017-09-04 Thread jin xing (JIRA)
jin xing created SPARK-21916: Summary: Set isolationOn=true when create client to remote hive metastore Key: SPARK-21916 URL: https://issues.apache.org/jira/browse/SPARK-21916 Project: Spark Iss

[jira] [Created] (SPARK-21874) Support changing database when rename table.

2017-08-30 Thread jin xing (JIRA)
jin xing created SPARK-21874: Summary: Support changing database when rename table. Key: SPARK-21874 URL: https://issues.apache.org/jira/browse/SPARK-21874 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21649) Support writing data into hive bucket table.

2017-08-06 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116132#comment-16116132 ] jin xing commented on SPARK-21649: -- made a pr: https://github.com/apache/spark/pull/1886

[jira] [Created] (SPARK-21649) Support writing data into hive bucket table.

2017-08-06 Thread jin xing (JIRA)
jin xing created SPARK-21649: Summary: Support writing data into hive bucket table. Key: SPARK-21649 URL: https://issues.apache.org/jira/browse/SPARK-21649 Project: Spark Issue Type: New Feature

[jira] [Closed] (SPARK-21509) Add a config to enable adaptive query execution only for the last query execution.

2017-07-27 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing closed SPARK-21509. Resolution: Won't Fix > Add a config to enable adaptive query execution only for the last query > executi

[jira] [Commented] (SPARK-21445) NotSerializableException thrown by UTF8String.IntWrapper

2017-07-25 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101224#comment-16101224 ] jin xing commented on SPARK-21445: -- Sorry, I report the exception by mistake. With the c

[jira] [Comment Edited] (SPARK-21530) Update description of spark.shuffle.maxChunksBeingTransferred

2017-07-25 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100236#comment-16100236 ] jin xing edited comment on SPARK-21530 at 7/26/17 12:56 AM: I

[jira] [Commented] (SPARK-21530) Update description of spark.shuffle.maxChunksBeingTransferred

2017-07-25 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100236#comment-16100236 ] jin xing commented on SPARK-21530: -- I will send follow-up PR soon. Thanks [~tgraves] >

[jira] [Commented] (SPARK-21445) NotSerializableException thrown by UTF8String.IntWrapper

2017-07-25 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099916#comment-16099916 ] jin xing commented on SPARK-21445: -- I'm not sure how to reproduce, I will try. > NotSer

[jira] [Commented] (SPARK-21445) NotSerializableException thrown by UTF8String.IntWrapper

2017-07-25 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099722#comment-16099722 ] jin xing commented on SPARK-21445: -- With this change, I'm still seeing exception below:

[jira] [Created] (SPARK-21509) Add a config to enable adaptive query execution only for the last query execution.

2017-07-22 Thread jin xing (JIRA)
jin xing created SPARK-21509: Summary: Add a config to enable adaptive query execution only for the last query execution. Key: SPARK-21509 URL: https://issues.apache.org/jira/browse/SPARK-21509 Project:

[jira] [Created] (SPARK-21414) Buffer in SlidingWindowFunctionFrame could be big though window is small

2017-07-13 Thread jin xing (JIRA)
jin xing created SPARK-21414: Summary: Buffer in SlidingWindowFunctionFrame could be big though window is small Key: SPARK-21414 URL: https://issues.apache.org/jira/browse/SPARK-21414 Project: Spark

[jira] [Created] (SPARK-21343) Refine the document for spark.reducer.maxReqSizeShuffleToMem

2017-07-07 Thread jin xing (JIRA)
jin xing created SPARK-21343: Summary: Refine the document for spark.reducer.maxReqSizeShuffleToMem Key: SPARK-21343 URL: https://issues.apache.org/jira/browse/SPARK-21343 Project: Spark Issue T

[jira] [Created] (SPARK-21342) Fix DownloadCallback to work well with RetryingBlockFetcher

2017-07-07 Thread jin xing (JIRA)
jin xing created SPARK-21342: Summary: Fix DownloadCallback to work well with RetryingBlockFetcher Key: SPARK-21342 URL: https://issues.apache.org/jira/browse/SPARK-21342 Project: Spark Issue Ty

[jira] [Updated] (SPARK-21315) Skip some spill files when generateIterator(startIndex) in ExternalAppendOnlyUnsafeRowArray.

2017-07-05 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-21315: - Description: In current code, it is expensive to use {{UnboundedFollowingWindowFunctionFrame}}, because i

[jira] [Created] (SPARK-21315) Skip some spill files when generateIterator(startIndex) in ExternalAppendOnlyUnsafeRowArray.

2017-07-05 Thread jin xing (JIRA)
jin xing created SPARK-21315: Summary: Skip some spill files when generateIterator(startIndex) in ExternalAppendOnlyUnsafeRowArray. Key: SPARK-21315 URL: https://issues.apache.org/jira/browse/SPARK-21315

[jira] [Updated] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-21270: - Issue Type: Improvement (was: Bug) > Improvement for memory config. > -- > >

[jira] [Commented] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070264#comment-16070264 ] jin xing commented on SPARK-21270: -- cc [~rxin] [~cloud_fan] [~joshrosen] > Improvement

[jira] [Created] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread jin xing (JIRA)
jin xing created SPARK-21270: Summary: Improvement for memory config. Key: SPARK-21270 URL: https://issues.apache.org/jira/browse/SPARK-21270 Project: Spark Issue Type: Bug Components:

[jira] [Created] (SPARK-21262) Stop sending 'stream request' when shuffle blocks.

2017-06-29 Thread jin xing (JIRA)
jin xing created SPARK-21262: Summary: Stop sending 'stream request' when shuffle blocks. Key: SPARK-21262 URL: https://issues.apache.org/jira/browse/SPARK-21262 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-21240) Fix code style for constructing and stopping a SparkContext in UT

2017-06-28 Thread jin xing (JIRA)
jin xing created SPARK-21240: Summary: Fix code style for constructing and stopping a SparkContext in UT Key: SPARK-21240 URL: https://issues.apache.org/jira/browse/SPARK-21240 Project: Spark Is

[jira] [Created] (SPARK-21236) Make the threshold of using HighlyCompressedStatus configurable.

2017-06-27 Thread jin xing (JIRA)
jin xing created SPARK-21236: Summary: Make the threshold of using HighlyCompressedStatus configurable. Key: SPARK-21236 URL: https://issues.apache.org/jira/browse/SPARK-21236 Project: Spark Iss

[jira] [Created] (SPARK-21194) Fail the putNull method when containsNull=false.

2017-06-23 Thread jin xing (JIRA)
jin xing created SPARK-21194: Summary: Fail the putNull method when containsNull=false. Key: SPARK-21194 URL: https://issues.apache.org/jira/browse/SPARK-21194 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-21175) Slow down "open blocks" on shuffle service when memory shortage to avoid OOM.

2017-06-22 Thread jin xing (JIRA)
jin xing created SPARK-21175: Summary: Slow down "open blocks" on shuffle service when memory shortage to avoid OOM. Key: SPARK-21175 URL: https://issues.apache.org/jira/browse/SPARK-21175 Project: Spark

[jira] [Commented] (SPARK-21047) Add test suites for complicated cases in ColumnarBatchSuite

2017-06-16 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16051769#comment-16051769 ] jin xing commented on SPARK-21047: -- [~kiszk] Would you mind if I make a try for this JIR

[jira] [Commented] (SPARK-21021) Reading partitioned parquet does not respect specified schema column order

2017-06-13 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047815#comment-16047815 ] jin xing commented on SPARK-21021: -- I think the reason of the incompatibility is that th

[jira] [Commented] (SPARK-19462) when spark.sql.adaptive.enabled is enabled, DF is not resilient to node/container failure

2017-06-08 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042440#comment-16042440 ] jin xing commented on SPARK-19462: -- [~ianlcsd] I can reproduce this bug in my env. I mad

[jira] [Updated] (SPARK-20994) Alleviate memory pressure in StreamManager

2017-06-07 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20994: - Description: In my cluster, we are suffering from OOM of shuffle-service. We found that a lot of executor

[jira] [Created] (SPARK-20994) Alleviate memory pressure in StreamManager

2017-06-05 Thread jin xing (JIRA)
jin xing created SPARK-20994: Summary: Alleviate memory pressure in StreamManager Key: SPARK-20994 URL: https://issues.apache.org/jira/browse/SPARK-20994 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-20985) Improve KryoSerializerResizableOutputSuite

2017-06-05 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20985: - Description: SparkContext should always be stopped after using, thus other tests won't complain that ther

[jira] [Created] (SPARK-20985) Improve KryoSerializerResizableOutputSuite

2017-06-05 Thread jin xing (JIRA)
jin xing created SPARK-20985: Summary: Improve KryoSerializerResizableOutputSuite Key: SPARK-20985 URL: https://issues.apache.org/jira/browse/SPARK-20985 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-20801) Store accurate size of blocks in MapStatus when it's above threshold.

2017-05-18 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20801: - Issue Type: Sub-task (was: Improvement) Parent: SPARK-19659 > Store accurate size of blocks in M

[jira] [Created] (SPARK-20801) Store accurate size of blocks in MapStatus when it's above threshold.

2017-05-18 Thread jin xing (JIRA)
jin xing created SPARK-20801: Summary: Store accurate size of blocks in MapStatus when it's above threshold. Key: SPARK-20801 URL: https://issues.apache.org/jira/browse/SPARK-20801 Project: Spark

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-24 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981124#comment-15981124 ] jin xing commented on SPARK-20426: -- [~jerryshao] I think lazy initialization can resolv

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978431#comment-15978431 ] jin xing commented on SPARK-20426: -- Yes, the applications are requesting too many shuffl

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978409#comment-15978409 ] jin xing commented on SPARK-20426: -- [~jerryshao] {quote} Brain storm: The problem here i

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978398#comment-15978398 ] jin xing commented on SPARK-20426: -- [~jerryshao] Thanks a lot for looking into this jira

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978394#comment-15978394 ] jin xing commented on SPARK-20426: -- Currently in the code, shuffle-read process is like

[jira] [Updated] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20426: - Description: Spark jobs are running on yarn cluster in my warehouse. We enabled the external shuffle serv

[jira] [Updated] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20426: - Description: Spark jobs are running on yarn cluster in my warehouse. We enabled the external shuffle serv

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978345#comment-15978345 ] jin xing commented on SPARK-20426: -- That's inside NodeManager(not application memory). W

[jira] [Comment Edited] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978345#comment-15978345 ] jin xing edited comment on SPARK-20426 at 4/21/17 9:00 AM: --- Tha

[jira] [Comment Edited] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978314#comment-15978314 ] jin xing edited comment on SPARK-20426 at 4/21/17 8:35 AM: --- I p

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978314#comment-15978314 ] jin xing commented on SPARK-20426: -- I posted 2 screenshots. External shuffle service of

[jira] [Updated] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20426: - Attachment: screenshot-2.png > OneForOneStreamManager occupies too much memory. > ---

[jira] [Updated] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20426: - Attachment: screenshot-1.png > OneForOneStreamManager occupies too much memory. > ---

[jira] [Commented] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978278#comment-15978278 ] jin xing commented on SPARK-20426: -- [~srowen] Thanks a lot for quick reply :) With *spa

[jira] [Updated] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20426: - Issue Type: Improvement (was: Bug) > OneForOneStreamManager occupies too much memory. >

[jira] [Created] (SPARK-20426) OneForOneStreamManager occupies too much memory.

2017-04-21 Thread jin xing (JIRA)
jin xing created SPARK-20426: Summary: OneForOneStreamManager occupies too much memory. Key: SPARK-20426 URL: https://issues.apache.org/jira/browse/SPARK-20426 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-04-17 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971025#comment-15971025 ] jin xing commented on SPARK-19659: -- [~cloud_fan] I refined the PR. In current change

[jira] [Updated] (SPARK-20333) Fix HashPartitioner in DAGSchedulerSuite

2017-04-13 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20333: - Description: In test "don't submit stage until its dependencies map outputs are registered (SPARK-5259)

[jira] [Updated] (SPARK-20333) Fix HashPartitioner in DAGSchedulerSuite

2017-04-13 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20333: - Description: In test "don't submit stage until its dependencies map outputs are registered (SPARK-5259)",

[jira] [Created] (SPARK-20333) Fix HashPartitioner in DAGSchedulerSuite

2017-04-13 Thread jin xing (JIRA)
jin xing created SPARK-20333: Summary: Fix HashPartitioner in DAGSchedulerSuite Key: SPARK-20333 URL: https://issues.apache.org/jira/browse/SPARK-20333 Project: Spark Issue Type: Bug Co

[jira] [Commented] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-04-11 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965452#comment-15965452 ] jin xing commented on SPARK-19659: -- Yes, I think it's a good idea to leverage memory man

[jira] [Commented] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-04-11 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964596#comment-15964596 ] jin xing commented on SPARK-19659: -- *bytesShuffleToMemory* is different from *bytesInFli

[jira] [Comment Edited] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-04-11 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964587#comment-15964587 ] jin xing edited comment on SPARK-19659 at 4/11/17 4:13 PM: --- [~c

[jira] [Comment Edited] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-04-11 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964587#comment-15964587 ] jin xing edited comment on SPARK-19659 at 4/11/17 4:11 PM: --- [~c

[jira] [Comment Edited] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-04-11 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964587#comment-15964587 ] jin xing edited comment on SPARK-19659 at 4/11/17 4:11 PM: --- [~c

[jira] [Commented] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-04-11 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964587#comment-15964587 ] jin xing commented on SPARK-19659: -- [~cloud_fan] Thanks a lot for taking look into this

[jira] [Commented] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-04-11 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964508#comment-15964508 ] jin xing commented on SPARK-19659: -- [~irashid] Tracking memory used by Netty by swapping

[jira] [Updated] (SPARK-20288) Improve BasicSchedulerIntegrationSuite "multi-stage job"

2017-04-10 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20288: - Description: ShuffleId is determined before job submitted. But it's hard to predict stageId by shuffleId

[jira] [Created] (SPARK-20288) Improve BasicSchedulerIntegrationSuite "multi-stage job"

2017-04-10 Thread jin xing (JIRA)
jin xing created SPARK-20288: Summary: Improve BasicSchedulerIntegrationSuite "multi-stage job" Key: SPARK-20288 URL: https://issues.apache.org/jira/browse/SPARK-20288 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20219) Schedule tasks based on size of input from ScheduledRDD

2017-04-07 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960608#comment-15960608 ] jin xing commented on SPARK-20219: -- [~kayousterhout] [~irashid] Thanks a lot for taking

[jira] [Updated] (SPARK-20219) Schedule tasks based on size of input from ScheduledRDD

2017-04-07 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-20219: - Attachment: screenshot-1.png > Schedule tasks based on size of input from ScheduledRDD >

[jira] [Created] (SPARK-20219) Schedule tasks based on size of input from ScheduledRDD

2017-04-04 Thread jin xing (JIRA)
jin xing created SPARK-20219: Summary: Schedule tasks based on size of input from ScheduledRDD Key: SPARK-20219 URL: https://issues.apache.org/jira/browse/SPARK-20219 Project: Spark Issue Type: I
