[jira] [Comment Edited] (SPARK-18700) getCached in HiveMetastoreCatalog not thread safe cause driver OOM

2016-12-08 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15718580#comment-15718580 ] Li Yuanjian edited comment on SPARK-18700 at 12/9/16 4:47 AM: -- Give a PR for

[jira] [Updated] (SPARK-18700) getCached in HiveMetastoreCatalog not thread safe cause driver OOM

2016-12-03 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-18700: Description: In our spark sql platform, each query use same HiveContext and independent

[jira] [Created] (SPARK-18700) getCached in HiveMetastoreCatalog not thread safe cause driver OOM

2016-12-03 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-18700: --- Summary: getCached in HiveMetastoreCatalog not thread safe cause driver OOM Key: SPARK-18700 URL: https://issues.apache.org/jira/browse/SPARK-18700 Project: Spark

[jira] [Commented] (SPARK-18700) getCached in HiveMetastoreCatalog not thread safe cause driver OOM

2016-12-03 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15718580#comment-15718580 ] Li Yuanjian commented on SPARK-18700: - I'll add PR for this soon, add ReadWriteLock for each table's

[jira] [Updated] (SPARK-18700) getCached in HiveMetastoreCatalog not thread safe cause driver OOM

2016-12-19 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-18700: Affects Version/s: 2.1.1 > getCached in HiveMetastoreCatalog not thread safe cause driver OOM >

[jira] [Updated] (SPARK-20408) Get glob path in parallel to reduce resolve relation time

2017-04-20 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-20408: Summary: Get glob path in parallel to reduce resolve relation time (was: Get glob path in

[jira] [Created] (SPARK-20408) Get glob path in parallel to boost resolve relation time

2017-04-20 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-20408: --- Summary: Get glob path in parallel to boost resolve relation time Key: SPARK-20408 URL: https://issues.apache.org/jira/browse/SPARK-20408 Project: Spark Issue

[jira] [Created] (SPARK-21560) Add hold mode for the LiveListenerBus

2017-07-28 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-21560: --- Summary: Add hold mode for the LiveListenerBus Key: SPARK-21560 URL: https://issues.apache.org/jira/browse/SPARK-21560 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-21435) Empty files should be skipped while write to file

2017-07-17 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089442#comment-16089442 ] Li Yuanjian commented on SPARK-21435: - [~sowen] I tested the patch in our scenario and add a UT, I

[jira] [Created] (SPARK-21435) Empty files should be skipped while write to file

2017-07-17 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-21435: --- Summary: Empty files should be skipped while write to file Key: SPARK-21435 URL: https://issues.apache.org/jira/browse/SPARK-21435 Project: Spark Issue Type:

[jira] [Commented] (SPARK-21560) Add hold mode for the LiveListenerBus

2017-07-28 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106030#comment-16106030 ] Li Yuanjian commented on SPARK-21560: - Something wrong with the sync between PR and JIRA, the PR

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-08-08 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119265#comment-16119265 ] Li Yuanjian commented on SPARK-18838: - I'm facing the same problem with [~milesc] {quote} We do not

[jira] [Created] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-19 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-22074: --- Summary: Task killed by other attempt task should not be resubmitted Key: SPARK-22074 URL: https://issues.apache.org/jira/browse/SPARK-22074 Project: Spark

[jira] [Updated] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-19 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-22074: Description: When a task killed by other task attempt, the task still resubmitted while its

[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180652#comment-16180652 ] Li Yuanjian commented on SPARK-22074: - Hi [~jerryshao], thanks for your comment. In my scenario, the

[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181943#comment-16181943 ] Li Yuanjian commented on SPARK-22074: - Hi [~jerryshao] saisai, the 66.0 resubmitted because of its

[jira] [Comment Edited] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181943#comment-16181943 ] Li Yuanjian edited comment on SPARK-22074 at 9/27/17 4:46 AM: -- Hi

[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-27 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182095#comment-16182095 ] Li Yuanjian commented on SPARK-22074: - Yes, that's right. > Task killed by other attempt task should

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-24 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265109#comment-16265109 ] Li Yuanjian commented on SPARK-2926: Yes, only the reduce stage. You're right, I shouldn't only pay

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-24 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265030#comment-16265030 ] Li Yuanjian commented on SPARK-2926: [~jerryshao], thanks a lot for your advice and reply. {quote}

[jira] [Updated] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-12-13 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-2926: --- Attachment: Spark Shuffle Test Report on Spark2.x.pdf [~jerryshao] Hi saisai, thanks for your advice,

[jira] [Created] (SPARK-22546) Allow users to update the dataType of a column

2017-11-17 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-22546: --- Summary: Allow users to update the dataType of a column Key: SPARK-22546 URL: https://issues.apache.org/jira/browse/SPARK-22546 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251398#comment-16251398 ] Li Yuanjian commented on SPARK-2926: During our work of migrating some old Hadoop job to Spark, I

[jira] [Issue Comment Deleted] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-2926: --- Comment: was deleted (was: The follow up work for SortShuffleReader in current master branch, detail

[jira] [Updated] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-2926: --- Attachment: SortBasedShuffleReader on Spark 2.x.pdf The follow up work for SortShuffleReader in

[jira] [Comment Edited] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251398#comment-16251398 ] Li Yuanjian edited comment on SPARK-2926 at 11/14/17 1:54 PM: -- During our

[jira] [Comment Edited] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251398#comment-16251398 ] Li Yuanjian edited comment on SPARK-2926 at 11/14/17 1:53 PM: -- During our

[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-14 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16251457#comment-16251457 ] Li Yuanjian commented on SPARK-2926: I just giving a preview PR above, I'll collect more suggestions

[jira] [Commented] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-11-09 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16245556#comment-16245556 ] Li Yuanjian commented on SPARK-20928: - Our team discussed the design sketch in detail; we have some

[jira] [Created] (SPARK-22753) Get rid of dataSource.writeAndRead

2017-12-11 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-22753: --- Summary: Get rid of dataSource.writeAndRead Key: SPARK-22753 URL: https://issues.apache.org/jira/browse/SPARK-22753 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-24036) Stateful operators in continuous processing

2018-05-10 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16470067#comment-16470067 ] Li Yuanjian commented on SPARK-24036: - I agree with the division about the kinds of tasks, that's

[jira] [Updated] (SPARK-24235) create the top-of-task RDD sending rows to the remote buffer

2018-05-10 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-24235: Description:

[jira] [Updated] (SPARK-24235) create the top-of-task RDD sending rows to the remote buffer

2018-05-10 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-24235: Description:

[jira] [Updated] (SPARK-24235) create the top-of-task RDD sending rows to the remote buffer

2018-05-10 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-24235: Description:

[jira] [Commented] (SPARK-24036) Stateful operators in continuous processing

2018-05-09 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469830#comment-16469830 ] Li Yuanjian commented on SPARK-24036: - Hi [~joseph.torres] Thanks for cc'ing me, looks great! My doc

[jira] [Comment Edited] (SPARK-24036) Stateful operators in continuous processing

2018-05-09 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469830#comment-16469830 ] Li Yuanjian edited comment on SPARK-24036 at 5/10/18 2:32 AM: -- Hi

[jira] [Issue Comment Deleted] (SPARK-23128) A new approach to do adaptive execution in Spark SQL

2018-05-11 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-23128: Comment: was deleted (was: I collected some user cases and performance improve effect during Baidu

[jira] [Commented] (SPARK-23128) A new approach to do adaptive execution in Spark SQL

2018-05-11 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471811#comment-16471811 ] Li Yuanjian commented on SPARK-23128: - I collected some user cases and performance improve effect

[jira] [Commented] (SPARK-23128) A new approach to do adaptive execution in Spark SQL

2018-05-11 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471812#comment-16471812 ] Li Yuanjian commented on SPARK-23128: - I collected some user cases and performance improve effect

[jira] [Updated] (SPARK-23128) A new approach to do adaptive execution in Spark SQL

2018-05-11 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-23128: Attachment: AdaptiveExecutioninBaidu.pdf > A new approach to do adaptive execution in Spark SQL >

[jira] [Created] (SPARK-24304) Scheduler changes for continuous processing shuffle support

2018-05-17 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-24304: --- Summary: Scheduler changes for continuous processing shuffle support Key: SPARK-24304 URL: https://issues.apache.org/jira/browse/SPARK-24304 Project: Spark

[jira] [Commented] (SPARK-24293) Serialized shuffle supports mapSideCombine

2018-05-16 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478406#comment-16478406 ] Li Yuanjian commented on SPARK-24293: - {quote} doing the map side combine manually with SQL

[jira] [Commented] (SPARK-24499) Documentation improvement of Spark core and SQL

2018-06-08 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506285#comment-16506285 ] Li Yuanjian commented on SPARK-24499: - No problem, thanks for ping me, our pleasure. We'll collect

[jira] [Commented] (SPARK-24183) add unit tests for ContinuousDataReader hook

2018-06-07 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504614#comment-16504614 ] Li Yuanjian commented on SPARK-24183: - Hi [~joseph.torres], I notice currently we already have

[jira] [Commented] (SPARK-24375) Design sketch: support barrier scheduling in Apache Spark

2018-06-06 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504171#comment-16504171 ] Li Yuanjian commented on SPARK-24375: - Got it, great thanks for your detailed explanation. > Design

[jira] [Commented] (SPARK-24375) Design sketch: support barrier scheduling in Apache Spark

2018-06-06 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503486#comment-16503486 ] Li Yuanjian commented on SPARK-24375: - Hi [~cloud_fan] and [~jiangxb1987], just a tiny question

[jira] [Comment Edited] (SPARK-24375) Design sketch: support barrier scheduling in Apache Spark

2018-06-06 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503486#comment-16503486 ] Li Yuanjian edited comment on SPARK-24375 at 6/6/18 3:55 PM: - Hi

[jira] [Commented] (SPARK-24210) incorrect handling of boolean expressions when using column in expressions in pyspark.sql.DataFrame filter function

2018-06-04 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500383#comment-16500383 ] Li Yuanjian commented on SPARK-24210: - I think it maybe not a bug. #KO: returns r1 and

[jira] [Comment Edited] (SPARK-24210) incorrect handling of boolean expressions when using column in expressions in pyspark.sql.DataFrame filter function

2018-06-04 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500383#comment-16500383 ] Li Yuanjian edited comment on SPARK-24210 at 6/4/18 3:34 PM: - I think it

[jira] [Commented] (SPARK-24630) SPIP: Support SQLStreaming in Spark

2018-06-25 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523064#comment-16523064 ] Li Yuanjian commented on SPARK-24630: - cc [~zsxwing] and [~tdas] We have some practice over

[jira] [Created] (SPARK-24665) Add SQLConf in PySpark to manage all sql configs

2018-06-26 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-24665: --- Summary: Add SQLConf in PySpark to manage all sql configs Key: SPARK-24665 URL: https://issues.apache.org/jira/browse/SPARK-24665 Project: Spark Issue Type:

[jira] [Commented] (SPARK-21661) SparkSQL can't merge load table from Hadoop

2018-04-26 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453591#comment-16453591 ] Li Yuanjian commented on SPARK-21661: - [~hyukjin.kwon] Hi Hyukjin, do you think we should backport 

[jira] [Closed] (SPARK-24108) ChunkedByteBuffer.writeFully method has not reset the limit value

2018-04-29 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian closed SPARK-24108. --- Duplicate submission of [SPARK-24107|https://issues.apache.org/jira/browse/SPARK-24107]; closing it.

[jira] [Created] (SPARK-22956) Union Stream Failover Cause `IllegalStateException`

2018-01-04 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-22956: --- Summary: Union Stream Failover Cause `IllegalStateException` Key: SPARK-22956 URL: https://issues.apache.org/jira/browse/SPARK-22956 Project: Spark Issue

[jira] [Created] (SPARK-24989) BlockFetcher should retry while getting OutOfDirectMemoryError

2018-08-01 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-24989: --- Summary: BlockFetcher should retry while getting OutOfDirectMemoryError Key: SPARK-24989 URL: https://issues.apache.org/jira/browse/SPARK-24989 Project: Spark

[jira] [Updated] (SPARK-24989) BlockFetcher should retry while getting OutOfDirectMemoryError

2018-08-01 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-24989: Attachment: FailedStage.png > BlockFetcher should retry while getting OutOfDirectMemoryError >

[jira] [Resolved] (SPARK-24989) BlockFetcher should retry while getting OutOfDirectMemoryError

2018-08-03 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian resolved SPARK-24989. - Resolution: Not A Problem The param `spark.reducer.maxBlocksInFlightPerAddress` added in

[jira] [Created] (SPARK-25077) Delete unused variable in WindowExec

2018-08-09 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-25077: --- Summary: Delete unused variable in WindowExec Key: SPARK-25077 URL: https://issues.apache.org/jira/browse/SPARK-25077 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-25100) Using KryoSerializer and setting registrationRequired true can lead job failed

2018-08-15 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-25100: Description: When spark.serializer is `org.apache.spark.serializer.KryoSerializer` and 

[jira] [Updated] (SPARK-25104) Validate user specified output schema

2018-08-15 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-25104: Description: With code changes in 

[jira] [Commented] (SPARK-25050) Handle more than two types in avro union types

2018-08-21 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587326#comment-16587326 ] Li Yuanjian commented on SPARK-25050: - Has more than two types in avro union types been supported in

[jira] [Commented] (SPARK-23128) A new approach to do adaptive execution in Spark SQL

2018-08-21 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587348#comment-16587348 ] Li Yuanjian commented on SPARK-23128: - Thanks for your comment [~xinyao]. {quote} If I understand

[jira] [Commented] (SPARK-25072) PySpark custom Row class can be given extra parameters

2018-08-18 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584705#comment-16584705 ] Li Yuanjian commented on SPARK-25072: - Interesting issue, but maybe this only in PySpark, Scala Row

[jira] [Commented] (SPARK-23128) A new approach to do adaptive execution in Spark SQL

2018-07-21 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551638#comment-16551638 ] Li Yuanjian commented on SPARK-23128: - [~tgraves] Thanks for your comment, as far as I know

[jira] [Commented] (SPARK-24295) Purge Structured streaming FileStreamSinkLog metadata compact file data.

2018-07-15 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544453#comment-16544453 ] Li Yuanjian commented on SPARK-24295: - Could you give more detailed information about how the

[jira] [Commented] (SPARK-24295) Purge Structured streaming FileStreamSinkLog metadata compact file data.

2018-07-18 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547873#comment-16547873 ] Li Yuanjian commented on SPARK-24295: - Thanks for your detailed explain. You can check this:

[jira] [Commented] (SPARK-24340) Clean up non-shuffle disk block manager files following executor death

2018-07-21 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551687#comment-16551687 ] Li Yuanjian commented on SPARK-24340: - cc [~jiangxb1987] I think this was resolved by your pr 21390,

[jira] [Commented] (SPARK-24755) Executor loss can cause task to not be resubmitted

2018-07-08 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536107#comment-16536107 ] Li Yuanjian commented on SPARK-24755: - No problem, thanks [~hthuynh2]. Thanks [~mridulm80] for

[jira] [Updated] (SPARK-23811) FetchFailed comes before Success of same task will cause child stage never succeed

2018-03-30 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-23811: Description: This is a bug caused by abnormal scenario describe below: # ShuffleMapTask 1.0

[jira] [Commented] (SPARK-23811) Same tasks' FetchFailed event comes before Success will cause child stage never succeed

2018-03-28 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418434#comment-16418434 ] Li Yuanjian commented on SPARK-23811: - The scenario can be reproduced by the test case below, added in

[jira] [Updated] (SPARK-23811) Same tasks' FetchFailed event comes before Success will cause child stage never succeed

2018-03-28 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-23811: Attachment: 1.png > Same tasks' FetchFailed event comes before Success will cause child stage >

[jira] [Created] (SPARK-23811) Same tasks' FetchFailed event comes before Success will cause child stage never succeed

2018-03-28 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-23811: --- Summary: Same tasks' FetchFailed event comes before Success will cause child stage never succeed Key: SPARK-23811 URL: https://issues.apache.org/jira/browse/SPARK-23811

[jira] [Updated] (SPARK-23811) Same tasks' FetchFailed event comes before Success will cause child stage never succeed

2018-03-28 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-23811: Attachment: 2.png > Same tasks' FetchFailed event comes before Success will cause child stage >

[jira] [Updated] (SPARK-23811) FetchFailed comes before Success of same task will cause child stage never succeed

2018-03-29 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-23811: Summary: FetchFailed comes before Success of same task will cause child stage never succeed (was:

[jira] [Updated] (SPARK-23533) Add support for changing ContinuousDataReader's startOffset

2018-02-27 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-23533: Summary: Add support for changing ContinuousDataReader's startOffset (was: Add support for

[jira] [Created] (SPARK-23533) Add support for changing ContinousDataReader's startOffset

2018-02-27 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-23533: --- Summary: Add support for changing ContinousDataReader's startOffset Key: SPARK-23533 URL: https://issues.apache.org/jira/browse/SPARK-23533 Project: Spark

[jira] [Commented] (SPARK-21661) SparkSQL can't merge load table from Hadoop

2018-04-26 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454070#comment-16454070 ] Li Yuanjian commented on SPARK-21661: - Got it. > SparkSQL can't merge load table from Hadoop >

[jira] [Commented] (SPARK-22565) Session-based windowing

2018-09-27 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631300#comment-16631300 ] Li Yuanjian commented on SPARK-22565: - Also cc [~zsxwing] [~tdas], we are translating the design doc

[jira] [Updated] (SPARK-22565) Session-based windowing

2018-09-27 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-22565: Attachment: screenshot-1.png > Session-based windowing > --- > >

[jira] [Comment Edited] (SPARK-22565) Session-based windowing

2018-09-27 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631298#comment-16631298 ] Li Yuanjian edited comment on SPARK-22565 at 9/28/18 3:23 AM: -- Thanks for

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2018-09-28 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631398#comment-16631398 ] Li Yuanjian commented on SPARK-10816: - Great thanks for [~kabhwan] notice me, just linked

[jira] [Commented] (SPARK-22565) Session-based windowing

2018-09-27 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631298#comment-16631298 ] Li Yuanjian commented on SPARK-22565: - Thanks for reporting this. Actually we also met this problem

[jira] [Commented] (SPARK-22565) Session-based windowing

2018-09-27 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631370#comment-16631370 ] Li Yuanjian commented on SPARK-22565: - [~kabhwan] Great thanks for noticing me, sorry for only

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2018-10-11 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16646177#comment-16646177 ] Li Yuanjian commented on SPARK-10816: - Thanks [~zsxwing] for your comment and discussion, great

[jira] [Commented] (SPARK-24499) Documentation improvement of Spark core and SQL

2018-09-23 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625061#comment-16625061 ] Li Yuanjian commented on SPARK-24499: - [~smilegator] Sorry for the late reply for this, we'll give

[jira] [Commented] (SPARK-25527) Job stuck waiting for last stage to start

2018-09-25 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627593#comment-16627593 ] Li Yuanjian commented on SPARK-25527: - {quote} There are no Tasks waiting for completion, and the

[jira] [Updated] (SPARK-10816) EventTime based sessionization

2018-09-28 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-10816: Attachment: Session Window Support For Structure Streaming.pdf > EventTime based sessionization >

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2018-09-28 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631751#comment-16631751 ] Li Yuanjian commented on SPARK-10816: - Design doc:

[jira] [Commented] (SPARK-25426) Remove the duplicate fallback logic in UnsafeProjection

2018-09-17 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617587#comment-16617587 ] Li Yuanjian commented on SPARK-25426: - Resolved by https://github.com/apache/spark/pull/22417. >