[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2021-04-16 Thread Lianhui Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-30602: - Description: In a large deployment of a Spark compute infrastructure, Spark shuffle is

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2021-04-16 Thread Lianhui Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-30602: - Description: In a large deployment of a Spark compute infrastructure, Spark shuffle is

[jira] [Updated] (SPARK-20986) Reset table's statistics after PruneFileSourcePartitions rule.

2017-06-05 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-20986: - Description: After PruneFileSourcePartitions rule, It needs reset table's statistics because

[jira] [Created] (SPARK-20986) Reset table's statistics after PruneFileSourcePartitions rule.

2017-06-05 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-20986: Summary: Reset table's statistics after PruneFileSourcePartitions rule. Key: SPARK-20986 URL: https://issues.apache.org/jira/browse/SPARK-20986 Project: Spark

[jira] [Updated] (SPARK-15616) CatalogRelation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2017-06-04 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15616: - Summary: CatalogRelation should fallback to HDFS size of partitions that are involved in Query

[jira] [Commented] (SPARK-14560) Cooperative Memory Management for Spillables

2017-03-30 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948578#comment-15948578 ] Lianhui Wang commented on SPARK-14560: -- [~darshankhamar123] [~mhornbech] Are you using

[jira] [Commented] (SPARK-14560) Cooperative Memory Management for Spillables

2016-11-23 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15690589#comment-15690589 ] Lianhui Wang commented on SPARK-14560: -- Can you provide some debug log of TaskMemoryManager? I think

[jira] [Commented] (SPARK-14560) Cooperative Memory Management for Spillables

2016-11-16 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670007#comment-15670007 ] Lianhui Wang commented on SPARK-14560: -- Now this issue is not in Branch-1.6. You can see

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-31 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622645#comment-15622645 ] Lianhui Wang commented on SPARK-15616: -- For 2.0, I have created a new branch

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-29 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15618476#comment-15618476 ] Lianhui Wang commented on SPARK-15616: -- I have updated the code and fixed the problem that you have

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-29 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15618475#comment-15618475 ] Lianhui Wang commented on SPARK-15616: -- I have updated the code and fixed the problem that you have

[jira] [Issue Comment Deleted] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-29 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15616: - Comment: was deleted (was: I have updated the code and fixed the problem that you have pointed

[jira] [Comment Edited] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-27 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613846#comment-15613846 ] Lianhui Wang edited comment on SPARK-15616 at 10/28/16 12:54 AM: - Yes, I

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-27 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613846#comment-15613846 ] Lianhui Wang commented on SPARK-15616: -- Yes, I think it can. But now the PR is based on branch 2.0,

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-26 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610455#comment-15610455 ] Lianhui Wang commented on SPARK-15616: -- if filter is for partition key, So this issue can did it for

[jira] [Comment Edited] (SPARK-2666) Always try to cancel running tasks when a stage is marked as zombie

2016-07-20 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387145#comment-15387145 ] Lianhui Wang edited comment on SPARK-2666 at 7/21/16 4:38 AM: -- Thanks. I

[jira] [Commented] (SPARK-2666) Always try to cancel running tasks when a stage is marked as zombie

2016-07-20 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387145#comment-15387145 ] Lianhui Wang commented on SPARK-2666: - I think what [~irashid] said is more about non-external

[jira] [Updated] (SPARK-16649) Push partition predicates down into metastore for OptimizeMetadataOnlyQuery

2016-07-20 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-16649: - Description: SPARK-6910 has supported for pushing partition predicates down into the Hive

[jira] [Updated] (SPARK-16649) Push partition predicates down into metastore for OptimizeMetadataOnlyQuery

2016-07-20 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-16649: - Summary: Push partition predicates down into metastore for OptimizeMetadataOnlyQuery (was: Push

[jira] [Created] (SPARK-16649) Push partition predicates down into the Hive metastore for OptimizeMetadataOnlyQuery

2016-07-20 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-16649: Summary: Push partition predicates down into the Hive metastore for OptimizeMetadataOnlyQuery Key: SPARK-16649 URL: https://issues.apache.org/jira/browse/SPARK-16649

[jira] [Commented] (SPARK-2666) Always try to cancel running tasks when a stage is marked as zombie

2016-07-20 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385510#comment-15385510 ] Lianhui Wang commented on SPARK-2666: - [~tgraves] Sorry for late reply. In

[jira] [Updated] (SPARK-16497) Don't throw an exception if drop non-existent TABLE/VIEW/Function/Partitions

2016-07-12 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-16497: - Summary: Don't throw an exception if drop non-existent TABLE/VIEW/Function/Partitions (was:

[jira] [Created] (SPARK-16497) Don't throw an exception for drop non-exist TABLE/VIEW/Function/Partitions

2016-07-12 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-16497: Summary: Don't throw an exception for drop non-exist TABLE/VIEW/Function/Partitions Key: SPARK-16497 URL: https://issues.apache.org/jira/browse/SPARK-16497 Project:

[jira] [Updated] (SPARK-16456) Reuse the uncorrelated scalar subqueries with the same logical plan in a query

2016-07-08 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-16456: - Summary: Reuse the uncorrelated scalar subqueries with the same logical plan in a query (was:

[jira] [Updated] (SPARK-16456) Reuse the uncorrelated scalar subqueries with the same logical plan in a query

2016-07-08 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-16456: - Description: In TPCDS Q14, the same physical plan of uncorrelated scalar subqueries from a CTE

[jira] [Updated] (SPARK-15752) Optimize metadata only query that has an aggregate whose children are deterministic project or filter operators

2016-07-06 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15752: - Summary: Optimize metadata only query that has an aggregate whose children are deterministic

[jira] [Updated] (SPARK-16302) Set the right number of partitions for reading data from a local collection.

2016-06-29 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-16302: - Summary: Set the right number of partitions for reading data from a local collection. (was: Set

[jira] [Updated] (SPARK-16302) Set the default number of partitions for reading data from a local collection.

2016-06-29 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-16302: - Summary: Set the default number of partitions for reading data from a local collection. (was:

[jira] [Created] (SPARK-16302) LocalTableScanExec always use defaultParallelism tasks even though it is very small seq.

2016-06-29 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-16302: Summary: LocalTableScanExec always use defaultParallelism tasks even though it is very small seq. Key: SPARK-16302 URL: https://issues.apache.org/jira/browse/SPARK-16302

[jira] [Created] (SPARK-15988) Implement DDL commands: CREATE/DROP TEMPORARY MACRO

2016-06-16 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-15988: Summary: Implement DDL commands: CREATE/DROP TEMPORARY MACRO Key: SPARK-15988 URL: https://issues.apache.org/jira/browse/SPARK-15988 Project: Spark Issue

[jira] [Updated] (SPARK-15752) support optimization for metadata only queries

2016-06-04 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15752: - Description: when query only use metadata (example: partition key), it can return results based

[jira] [Updated] (SPARK-15756) Support command 'create table stored as orcfile/parquetfile/avrofile'

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15756: - Description: Now Spark SQL can support 'create table src stored as orc/parquet/avro' for

[jira] [Updated] (SPARK-15756) Support command 'create table stored as orcfile/parquetfile/avrofile'

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15756: - Summary: Support command 'create table stored as orcfile/parquetfile/avrofile' (was: Support

[jira] [Updated] (SPARK-15756) Support create table stored as orcfile/parquetfile/avrofile

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15756: - Description: in

[jira] [Updated] (SPARK-15756) Support create table stored as orcfile/parquetfile/avrofile

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15756: - Description: in

[jira] [Updated] (SPARK-15756) Support create table stored as orcfile/parquetfile/avrofile

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15756: - Summary: Support create table stored as orcfile/parquetfile/avrofile (was: SQL “stored as

[jira] [Updated] (SPARK-15752) support optimization for metadata only queries

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15752: - Description: when query only use metadata (example: partition key), it can return results based

[jira] [Updated] (SPARK-15752) support optimization for metadata only queries

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15752: - Description: when query just has use metadata (example: partition key), it can return results

[jira] [Updated] (SPARK-15752) support optimization for metadata only queries

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15752: - Summary: support optimization for metadata only queries (was: support optimization for metadata

[jira] [Created] (SPARK-15752) support optimization for metadata only queries.

2016-06-03 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-15752: Summary: support optimization for metadata only queries. Key: SPARK-15752 URL: https://issues.apache.org/jira/browse/SPARK-15752 Project: Spark Issue Type:

[jira] [Updated] (SPARK-15752) support optimization for metadata only queries.

2016-06-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15752: - Issue Type: Improvement (was: Bug) > support optimization for metadata only queries. >

[jira] [Updated] (SPARK-15664) Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib

2016-05-31 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15664: - Description: if sparkContext.set CheckpointDir to another Dir that is not default FileSystem, it

[jira] [Updated] (SPARK-15664) Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib

2016-05-31 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15664: - Description: if sparkContext.set CheckpointDir to another Dir that is not default FileSystem, it

[jira] [Updated] (SPARK-15664) Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib

2016-05-31 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15664: - Component/s: MLlib > Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing >

[jira] [Created] (SPARK-15664) Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib

2016-05-31 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-15664: Summary: Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib Key: SPARK-15664 URL: https://issues.apache.org/jira/browse/SPARK-15664

[jira] [Created] (SPARK-15649) avoid to serialize MetastoreRelation in HiveTableScanExec

2016-05-30 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-15649: Summary: avoid to serialize MetastoreRelation in HiveTableScanExec Key: SPARK-15649 URL: https://issues.apache.org/jira/browse/SPARK-15649 Project: Spark

[jira] [Updated] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-05-27 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15616: - Description: Currently if some partitions of a partitioned table are used in join operation we

[jira] [Created] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-05-27 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-15616: Summary: Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available. Key: SPARK-15616 URL:

[jira] [Updated] (SPARK-15335) Implement TRUNCATE TABLE Command

2016-05-18 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15335: - Summary: Implement TRUNCATE TABLE Command (was: In Spark 2.0 TRUNCATE TABLE is unsupported) >

[jira] [Updated] (SPARK-15246) Fix code style and improve volatile for SPARK-4452

2016-05-10 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-15246: - Summary: Fix code style and improve volatile for SPARK-4452 (was: Fix code style and improve

[jira] [Created] (SPARK-15246) Fix code style and improve volatile for Spillable

2016-05-10 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-15246: Summary: Fix code style and improve volatile for Spillable Key: SPARK-15246 URL: https://issues.apache.org/jira/browse/SPARK-15246 Project: Spark Issue

[jira] [Created] (SPARK-14705) support Multiple FileSystem for YARN STAGING DIR

2016-04-18 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-14705: Summary: support Multiple FileSystem for YARN STAGING DIR Key: SPARK-14705 URL: https://issues.apache.org/jira/browse/SPARK-14705 Project: Spark Issue Type:

[jira] [Closed] (SPARK-12322) recompute an cached RDD partition when getting its block is failed

2015-12-14 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang closed SPARK-12322. Resolution: Invalid > recompute an cached RDD partition when getting its block is failed >

[jira] [Created] (SPARK-12322) recompute an cached RDD partition when getting its block is failed

2015-12-14 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-12322: Summary: recompute an cached RDD partition when getting its block is failed Key: SPARK-12322 URL: https://issues.apache.org/jira/browse/SPARK-12322 Project: Spark

[jira] [Updated] (SPARK-4621) Shuffle index can cached for SortShuffleManager in ExternalShuffle in order to reduce indexFile's io

2015-12-12 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-4621: Summary: Shuffle index can cached for SortShuffleManager in ExternalShuffle in order to reduce

[jira] [Updated] (SPARK-4621) Shuffle index can be cached for SortShuffleManager in ExternalShuffle in order to reduce indexFile's io

2015-12-12 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-4621: Description: in ExternalShuffle, we can use LRUCache to store recently finished shuffle index and

[jira] [Updated] (SPARK-4621) Shuffle index can be cached for SortShuffleManager in ExternalShuffle in order to reduce indexFile's io

2015-12-12 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-4621: Summary: Shuffle index can be cached for SortShuffleManager in ExternalShuffle in order to reduce

[jira] [Updated] (SPARK-12130) Replace shuffleManagerClass with shortShuffleMgrNames in ExternalShuffleBlockResolver

2015-12-03 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-12130: - Component/s: YARN Shuffle > Replace shuffleManagerClass with

[jira] [Created] (SPARK-12130) Replace shuffleManagerClass with shortShuffleMgrNames in ExternalShuffleBlockResolver

2015-12-03 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-12130: Summary: Replace shuffleManagerClass with shortShuffleMgrNames in ExternalShuffleBlockResolver Key: SPARK-12130 URL: https://issues.apache.org/jira/browse/SPARK-12130

[jira] [Updated] (SPARK-11252) andrewor14

2015-10-22 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-11252: - Summary: andrewor14 (was: ExternalShuffleClient should release connection after it had

[jira] [Updated] (SPARK-11252) ShuffleClient should release connection after fetching blocks had been completed in yarn's external shuffle

2015-10-22 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-11252: - Summary: ShuffleClient should release connection after fetching blocks had been completed in

[jira] [Created] (SPARK-11252) ExternalShuffleClient should release connection after it had completed to fetch blocks from yarn's NameManager

2015-10-21 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-11252: Summary: ExternalShuffleClient should release connection after it had completed to fetch blocks from yarn's NameManager Key: SPARK-11252 URL:

[jira] [Created] (SPARK-11026) spark.yarn.user.classpath.first doesn't work for remote addJars

2015-10-09 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-11026: Summary: spark.yarn.user.classpath.first doesn't work for remote addJars Key: SPARK-11026 URL: https://issues.apache.org/jira/browse/SPARK-11026 Project: Spark

[jira] [Created] (SPARK-10775) add search keywords in history page ui

2015-09-23 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-10775: Summary: add search keywords in history page ui Key: SPARK-10775 URL: https://issues.apache.org/jira/browse/SPARK-10775 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-2666) Always try to cancel running tasks when a stage is marked as zombie

2015-09-22 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902459#comment-14902459 ] Lianhui Wang edited comment on SPARK-2666 at 9/22/15 12:22 PM: --- [~imranr]

[jira] [Commented] (SPARK-2666) Always try to cancel running tasks when a stage is marked as zombie

2015-09-22 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902459#comment-14902459 ] Lianhui Wang commented on SPARK-2666: - [~imranr] thanks, i have take a look at

[jira] [Comment Edited] (SPARK-2666) Always try to cancel running tasks when a stage is marked as zombie

2015-09-22 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902459#comment-14902459 ] Lianhui Wang edited comment on SPARK-2666 at 9/22/15 12:11 PM: --- [~imranr]

[jira] [Commented] (SPARK-8646) PySpark does not run on YARN if master not provided in command line

2015-07-16 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629409#comment-14629409 ] Lianhui Wang commented on SPARK-8646: - yes, when i use this command:

[jira] [Comment Edited] (SPARK-8646) PySpark does not run on YARN if master not provided in command line

2015-07-16 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629409#comment-14629409 ] Lianhui Wang edited comment on SPARK-8646 at 7/16/15 8:25 AM: --

[jira] [Comment Edited] (SPARK-8646) PySpark does not run on YARN if master not provided in command line

2015-07-16 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629409#comment-14629409 ] Lianhui Wang edited comment on SPARK-8646 at 7/16/15 8:26 AM: --

[jira] [Commented] (SPARK-8646) PySpark does not run on YARN if master not provided in command line

2015-07-15 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629103#comment-14629103 ] Lianhui Wang commented on SPARK-8646: - yes, when we set master=yarn-client on

[jira] [Issue Comment Deleted] (SPARK-8646) PySpark does not run on YARN if master not provided in command line

2015-07-15 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-8646: Comment: was deleted (was: yes, when we set master=yarn-client on pyspark/SparkContext.py, it do

[jira] [Commented] (SPARK-8646) PySpark does not run on YARN

2015-07-13 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624521#comment-14624521 ] Lianhui Wang commented on SPARK-8646: - [~juliet] from your spark1.4-verbose.log, i

[jira] [Commented] (SPARK-8646) PySpark does not run on YARN

2015-07-13 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625725#comment-14625725 ] Lianhui Wang commented on SPARK-8646: - [~juliet] can you provide your spark-submit

[jira] [Commented] (SPARK-8646) PySpark does not run on YARN

2015-07-09 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620466#comment-14620466 ] Lianhui Wang commented on SPARK-8646: - [~j_houg] can you add --verbose to spark-submit

[jira] [Commented] (SPARK-8646) PySpark does not run on YARN

2015-06-26 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603973#comment-14603973 ] Lianhui Wang commented on SPARK-8646: - from [~juliet] 's logs, i think you miss python

[jira] [Updated] (SPARK-6954) ExecutorAllocationManager can end up requesting a negative number of executors

2015-06-19 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-6954: Summary: ExecutorAllocationManager can end up requesting a negative number of executors (was:

[jira] [Created] (SPARK-8430) in Yarn's shuffle service ExternalShuffleBlockResolver should support UnsafeShuffleManager

2015-06-18 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-8430: --- Summary: in Yarn's shuffle service ExternalShuffleBlockResolver should support UnsafeShuffleManager Key: SPARK-8430 URL: https://issues.apache.org/jira/browse/SPARK-8430

[jira] [Updated] (SPARK-8381) reuse typeConvert when convert Seq[Row] to catalyst type

2015-06-15 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-8381: Description: This method CatalystTypeConverters.convertToCatalyst is slow, so for batch conversion

[jira] [Created] (SPARK-8381) reuse-typeConvert when convert Seq[Row] to CatalystType

2015-06-15 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-8381: --- Summary: reuse-typeConvert when convert Seq[Row] to CatalystType Key: SPARK-8381 URL: https://issues.apache.org/jira/browse/SPARK-8381 Project: Spark Issue

[jira] [Updated] (SPARK-8381) reuse typeConvert when convert Seq[Row] to catalyst type

2015-06-15 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-8381: Summary: reuse typeConvert when convert Seq[Row] to catalyst type (was: reuse-typeConvert when

[jira] [Updated] (SPARK-8381) reuse-typeConvert when convert Seq[Row] to catalyst type

2015-06-15 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-8381: Summary: reuse-typeConvert when convert Seq[Row] to catalyst type (was: reuse-typeConvert when

[jira] [Commented] (SPARK-6700) flaky test: run Python application in yarn-cluster mode

2015-04-06 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14480988#comment-14480988 ] Lianhui Wang commented on SPARK-6700: - i do not think this is related to SPARK-6506

[jira] [Comment Edited] (SPARK-6700) flaky test: run Python application in yarn-cluster mode

2015-04-06 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14480988#comment-14480988 ] Lianhui Wang edited comment on SPARK-6700 at 4/6/15 6:49 AM: -

[jira] [Comment Edited] (SPARK-6506) python support yarn cluster mode requires SPARK_HOME to be set

2015-03-26 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381844#comment-14381844 ] Lianhui Wang edited comment on SPARK-6506 at 3/26/15 1:18 PM: --

[jira] [Commented] (SPARK-6506) python support yarn cluster mode requires SPARK_HOME to be set

2015-03-26 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381844#comment-14381844 ] Lianhui Wang commented on SPARK-6506: - hi [~tgraves] I use 1.3.0 to run. if i donot

[jira] [Comment Edited] (SPARK-6506) python support yarn cluster mode requires SPARK_HOME to be set

2015-03-26 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381844#comment-14381844 ] Lianhui Wang edited comment on SPARK-6506 at 3/26/15 1:17 PM: --

[jira] [Created] (SPARK-6103) remove unused class to import in EdgeRDDImpl

2015-03-01 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-6103: --- Summary: remove unused class to import in EdgeRDDImpl Key: SPARK-6103 URL: https://issues.apache.org/jira/browse/SPARK-6103 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-6056) Unlimit offHeap memory use cause RM killing the container

2015-02-28 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341498#comment-14341498 ] Lianhui Wang edited comment on SPARK-6056 at 2/28/15 12:36 PM:

[jira] [Commented] (SPARK-6056) Unlimit offHeap memory use cause RM killing the container

2015-02-28 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341498#comment-14341498 ] Lianhui Wang commented on SPARK-6056: - [~carlmartin] what is your executor's memory?

[jira] [Commented] (SPARK-6056) Unlimit offHeap memory use cause RM killing the container

2015-02-27 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341260#comment-14341260 ] Lianhui Wang commented on SPARK-6056: - [~adav] from your given information, when

[jira] [Updated] (SPARK-5763) Sort-based Groupby and Join to resolve skewed data

2015-02-12 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-5763: Description: In SPARK-4644, it provide a way to resolve skewed data. But when we has more keys

[jira] [Updated] (SPARK-5763) Sort-based Groupby and Join to resolve skewed data

2015-02-12 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-5763: Description: In SPARK-4644, it provide a way to resolve skewed data. But when we has more keys

[jira] [Comment Edited] (SPARK-5721) Propagate missing external shuffle service errors to client

2015-02-12 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317895#comment-14317895 ] Lianhui Wang edited comment on SPARK-5721 at 2/12/15 9:39 AM: --

[jira] [Created] (SPARK-5763) Sort-based Groupby and Join to resolve skewed data

2015-02-12 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-5763: --- Summary: Sort-based Groupby and Join to resolve skewed data Key: SPARK-5763 URL: https://issues.apache.org/jira/browse/SPARK-5763 Project: Spark Issue Type:

[jira] [Created] (SPARK-5759) ExecutorRunnable should catch YarnException while NMClient start container

2015-02-11 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-5759: --- Summary: ExecutorRunnable should catch YarnException while NMClient start container Key: SPARK-5759 URL: https://issues.apache.org/jira/browse/SPARK-5759 Project:

[jira] [Comment Edited] (SPARK-5227) InputOutputMetricsSuite input metrics when reading text file with multiple splits test fails in branch-1.2 SBT Jenkins build w/hadoop1.0 and hadoop2.0 profiles

2015-02-09 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312163#comment-14312163 ] Lianhui Wang edited comment on SPARK-5227 at 2/9/15 12:16 PM: --

[jira] [Created] (SPARK-5687) in TaskResultGetter need to catch OutOfMemoryError and report failed when it cannot fetch results.

2015-02-09 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-5687: --- Summary: in TaskResultGetter need to catch OutOfMemoryError and report failed when it cannot fetch results. Key: SPARK-5687 URL: https://issues.apache.org/jira/browse/SPARK-5687

[jira] [Commented] (SPARK-5227) InputOutputMetricsSuite input metrics when reading text file with multiple splits test fails in branch-1.2 SBT Jenkins build w/hadoop1.0 and hadoop2.0 profiles

2015-02-09 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312163#comment-14312163 ] Lianhui Wang commented on SPARK-5227: - split size in hadoop's FileInputFormat:

[jira] [Updated] (SPARK-5687) in TaskResultGetter need to catch OutOfMemoryError.

2015-02-09 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lianhui Wang updated SPARK-5687: Description: because in enqueueSuccessfulTask there is another thread to fetch result, if result is
