[jira] [Updated] (SPARK-23207) Shuffle+Repartition on an DataFrame could lead to incorrect answers

2018-08-23 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-23207: - Affects Version/s: 1.6.0 2.0.0 2.1.0

[jira] [Commented] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-21 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587020#comment-16587020 ] Jiang Xingbo commented on SPARK-25114: -- The PR has been merged to master and 2.3 >

[jira] [Created] (SPARK-25161) Fix several bugs in failure handling of barrier execution mode

2018-08-20 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-25161: Summary: Fix several bugs in failure handling of barrier execution mode Key: SPARK-25161 URL: https://issues.apache.org/jira/browse/SPARK-25161 Project: Spark

[jira] [Commented] (SPARK-24941) Add RDDBarrier.coalesce() function

2018-08-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579930#comment-16579930 ] Jiang Xingbo commented on SPARK-24941: -- Shall we add something like `spark.default.parallelism`? It

[jira] [Commented] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579879#comment-16579879 ] Jiang Xingbo commented on SPARK-25114: -- I created https://github.com/apache/spark/pull/22101 for

[jira] [Updated] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-25114: - Labels: correctness (was: ) > RecordBinaryComparator may return wrong result when subtraction

[jira] [Updated] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-25114: - Priority: Blocker (was: Major) > RecordBinaryComparator may return wrong result when

[jira] [Created] (SPARK-25114) RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE

2018-08-14 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-25114: Summary: RecordBinaryComparator may return wrong result when subtraction between two words is divisible by Integer.MAX_VALUE Key: SPARK-25114 URL:

[jira] [Created] (SPARK-25095) Python support for BarrierTaskContext

2018-08-12 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-25095: Summary: Python support for BarrierTaskContext Key: SPARK-25095 URL: https://issues.apache.org/jira/browse/SPARK-25095 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-23207) Shuffle+Repartition on an DataFrame could lead to incorrect answers

2018-08-09 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574903#comment-16574903 ] Jiang Xingbo commented on SPARK-23207: -- This affects the 2.2 and lower versions, the reason why we

[jira] [Created] (SPARK-25074) Implement maxNumConcurrentTasks() in MesosFineGrainedSchedulerBackend

2018-08-09 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-25074: Summary: Implement maxNumConcurrentTasks() in MesosFineGrainedSchedulerBackend Key: SPARK-25074 URL: https://issues.apache.org/jira/browse/SPARK-25074 Project: Spark

[jira] [Created] (SPARK-25045) Make `RDDBarrier.mapParititions` similar to `RDD.mapPartitions`

2018-08-07 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-25045: Summary: Make `RDDBarrier.mapParititions` similar to `RDD.mapPartitions` Key: SPARK-25045 URL: https://issues.apache.org/jira/browse/SPARK-25045 Project: Spark

[jira] [Created] (SPARK-25030) SparkSubmit will not return result if the mainClass submitted creates a Timer()

2018-08-06 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-25030: Summary: SparkSubmit will not return result if the mainClass submitted creates a Timer() Key: SPARK-25030 URL: https://issues.apache.org/jira/browse/SPARK-25030

[jira] [Commented] (SPARK-24375) Design sketch: support barrier scheduling in Apache Spark

2018-08-04 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569202#comment-16569202 ] Jiang Xingbo commented on SPARK-24375: -- [~mridulm80] You are right that now we are not able to

[jira] [Created] (SPARK-25017) Add test suite for ContextBarrierState

2018-08-03 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-25017: Summary: Add test suite for ContextBarrierState Key: SPARK-25017 URL: https://issues.apache.org/jira/browse/SPARK-25017 Project: Spark Issue Type: Test

[jira] [Commented] (SPARK-24884) Implement regexp_extract_all

2018-08-02 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566752#comment-16566752 ] Jiang Xingbo commented on SPARK-24884: -- You don't need to be assigned, just prepare and submit a

[jira] [Commented] (SPARK-24817) Implement BarrierTaskContext.barrier()

2018-08-01 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566278#comment-16566278 ] Jiang Xingbo commented on SPARK-24817: -- Actually the current implementation of _barrier_ function

[jira] [Resolved] (SPARK-24582) Design: Barrier execution mode

2018-07-30 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo resolved SPARK-24582. -- Resolution: Fixed > Design: Barrier execution mode > -- > >

[jira] [Resolved] (SPARK-24581) Design: BarrierTaskContext.barrier()

2018-07-30 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo resolved SPARK-24581. -- Resolution: Fixed > Design: BarrierTaskContext.barrier() >

[jira] [Created] (SPARK-24954) Fail fast on job submit if run a barrier stage with dynamic resource allocation enabled

2018-07-27 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24954: Summary: Fail fast on job submit if run a barrier stage with dynamic resource allocation enabled Key: SPARK-24954 URL: https://issues.apache.org/jira/browse/SPARK-24954

[jira] [Updated] (SPARK-24942) Improve cluster resource management with jobs containing barrier stage

2018-07-27 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-24942: - Target Version/s: 3.0.0 > Improve cluster resource management with jobs containing barrier

[jira] [Updated] (SPARK-24941) Add RDDBarrier.coalesce() function

2018-07-27 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-24941: - Target Version/s: 3.0.0 > Add RDDBarrier.coalesce() function >

[jira] [Created] (SPARK-24942) Improve cluster resource management with jobs containing barrier stage

2018-07-26 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24942: Summary: Improve cluster resource management with jobs containing barrier stage Key: SPARK-24942 URL: https://issues.apache.org/jira/browse/SPARK-24942 Project:

[jira] [Created] (SPARK-24941) Add RDDBarrier.coalesce() function

2018-07-26 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24941: Summary: Add RDDBarrier.coalesce() function Key: SPARK-24941 URL: https://issues.apache.org/jira/browse/SPARK-24941 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-24581) Design: BarrierTaskContext.barrier()

2018-07-24 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554409#comment-16554409 ] Jiang Xingbo commented on SPARK-24581: -- Design doc:

[jira] [Resolved] (SPARK-24340) Clean up non-shuffle disk block manager files following executor death

2018-07-21 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo resolved SPARK-24340. -- Resolution: Fixed > Clean up non-shuffle disk block manager files following executor death >

[jira] [Commented] (SPARK-24340) Clean up non-shuffle disk block manager files following executor death

2018-07-21 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551714#comment-16551714 ] Jiang Xingbo commented on SPARK-24340: -- Thanks~ > Clean up non-shuffle disk block manager files

[jira] [Created] (SPARK-24877) Ignore the task completion event from a zombie barrier task

2018-07-20 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24877: Summary: Ignore the task completion event from a zombie barrier task Key: SPARK-24877 URL: https://issues.apache.org/jira/browse/SPARK-24877 Project: Spark

[jira] [Created] (SPARK-24874) Allow hybrid of both barrier tasks and regular tasks in a stage

2018-07-20 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24874: Summary: Allow hybrid of both barrier tasks and regular tasks in a stage Key: SPARK-24874 URL: https://issues.apache.org/jira/browse/SPARK-24874 Project: Spark

[jira] [Commented] (SPARK-24375) Design sketch: support barrier scheduling in Apache Spark

2018-07-18 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548784#comment-16548784 ] Jiang Xingbo commented on SPARK-24375: -- {quote}Is the 'barrier' logic pluggable ? Instead of only

[jira] [Created] (SPARK-24824) Make Spark task speculation a per-stage config

2018-07-16 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24824: Summary: Make Spark task speculation a per-stage config Key: SPARK-24824 URL: https://issues.apache.org/jira/browse/SPARK-24824 Project: Spark Issue Type:

[jira] [Created] (SPARK-24823) Cancel a job that contains barrier stage(s) if the barrier tasks don't get launched within a configured time

2018-07-16 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24823: Summary: Cancel a job that contains barrier stage(s) if the barrier tasks don't get launched within a configured time Key: SPARK-24823 URL:

[jira] [Created] (SPARK-24822) Python support for barrier execution mode

2018-07-16 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24822: Summary: Python support for barrier execution mode Key: SPARK-24822 URL: https://issues.apache.org/jira/browse/SPARK-24822 Project: Spark Issue Type: New

[jira] [Created] (SPARK-24821) Fail fast when submitted job compute on a subset of all the partitions for a barrier stage

2018-07-16 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24821: Summary: Fail fast when submitted job compute on a subset of all the partitions for a barrier stage Key: SPARK-24821 URL: https://issues.apache.org/jira/browse/SPARK-24821

[jira] [Created] (SPARK-24820) Fail fast when submitted job contains PartitionPruningRDD in a barrier stage

2018-07-16 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24820: Summary: Fail fast when submitted job contains PartitionPruningRDD in a barrier stage Key: SPARK-24820 URL: https://issues.apache.org/jira/browse/SPARK-24820

[jira] [Created] (SPARK-24819) Fail fast when no enough slots to launch the barrier stage on job submitted

2018-07-16 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24819: Summary: Fail fast when no enough slots to launch the barrier stage on job submitted Key: SPARK-24819 URL: https://issues.apache.org/jira/browse/SPARK-24819 Project:

[jira] [Created] (SPARK-24818) Ensure all the barrier tasks in the same stage are launched together

2018-07-16 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24818: Summary: Ensure all the barrier tasks in the same stage are launched together Key: SPARK-24818 URL: https://issues.apache.org/jira/browse/SPARK-24818 Project: Spark

[jira] [Created] (SPARK-24817) Implement BarrierTaskContext.barrier()

2018-07-16 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24817: Summary: Implement BarrierTaskContext.barrier() Key: SPARK-24817 URL: https://issues.apache.org/jira/browse/SPARK-24817 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-24795) Implement barrier execution mode

2018-07-12 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-24795: - Description: Implement barrier execution mode, as described in SPARK-24582 Include all the API

[jira] [Created] (SPARK-24795) Implement barrier execution mode

2018-07-12 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24795: Summary: Implement barrier execution mode Key: SPARK-24795 URL: https://issues.apache.org/jira/browse/SPARK-24795 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-24582) Design: Barrier execution mode

2018-07-08 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536331#comment-16536331 ] Jiang Xingbo commented on SPARK-24582: -- Design doc: 

[jira] [Created] (SPARK-24564) Add test suite for RecordBinaryComparator

2018-06-14 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24564: Summary: Add test suite for RecordBinaryComparator Key: SPARK-24564 URL: https://issues.apache.org/jira/browse/SPARK-24564 Project: Spark Issue Type: Test

[jira] [Comment Edited] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511695#comment-16511695 ] Jiang Xingbo edited comment on SPARK-24552 at 6/13/18 9:47 PM: --- IIUC

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511695#comment-16511695 ] Jiang Xingbo commented on SPARK-24552: -- IIUC stageAttemptId + taskAttemptId shall probably define a

[jira] [Commented] (SPARK-24387) Heartbeat-timeout executor is added back and used again

2018-06-11 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508832#comment-16508832 ] Jiang Xingbo commented on SPARK-24387: -- {quote}So I think there's a race condition that the backend

[jira] [Commented] (SPARK-24492) Endless attempted task when TaskCommitDenied exception writing to S3A

2018-06-08 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506232#comment-16506232 ] Jiang Xingbo commented on SPARK-24492: -- It is possible that one task attempt acquired the

[jira] [Commented] (SPARK-24375) Design sketch: support barrier scheduling in Apache Spark

2018-06-06 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503647#comment-16503647 ] Jiang Xingbo commented on SPARK-24375: -- The major problem is that tasks in the same stage of a MPI

[jira] [Commented] (SPARK-24375) Design sketch: support barrier scheduling in Apache Spark

2018-05-24 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489764#comment-16489764 ] Jiang Xingbo commented on SPARK-24375: -- We propose to add new RDDBarrier and BarrierTaskContext to

[jira] [Created] (SPARK-24340) Clean up non-shuffle disk block manager files following executor death

2018-05-22 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24340: Summary: Clean up non-shuffle disk block manager files following executor death Key: SPARK-24340 URL: https://issues.apache.org/jira/browse/SPARK-24340 Project:

[jira] [Created] (SPARK-23881) Flaky test: JobCancellationSuite."interruptible iterator of shuffle reader"

2018-04-06 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-23881: Summary: Flaky test: JobCancellationSuite."interruptible iterator of shuffle reader" Key: SPARK-23881 URL: https://issues.apache.org/jira/browse/SPARK-23881 Project:

[jira] [Commented] (SPARK-23525) ALTER TABLE CHANGE COLUMN doesn't work for external hive table

2018-02-27 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16378546#comment-16378546 ] Jiang Xingbo commented on SPARK-23525: -- I'm working on a fix for it, and will try to backport the

[jira] [Commented] (SPARK-23525) ALTER TABLE CHANGE COLUMN doesn't work for external hive table

2018-02-27 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16378533#comment-16378533 ] Jiang Xingbo commented on SPARK-23525: -- Thank you for reporting this. I believe the bug is caused

[jira] [Updated] (SPARK-23525) ALTER TABLE CHANGE COLUMN doesn't work for external hive table

2018-02-27 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-23525: - Affects Version/s: 2.3.0 Priority: Major (was: Minor) Summary: ALTER

[jira] [Comment Edited] (SPARK-23139) Read eventLog file with mixed encodings

2018-02-07 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356507#comment-16356507 ] Jiang Xingbo edited comment on SPARK-23139 at 2/8/18 5:25 AM: --

[jira] [Commented] (SPARK-23139) Read eventLog file with mixed encodings

2018-02-07 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356507#comment-16356507 ] Jiang Xingbo commented on SPARK-23139: -- ``` EventLog may contain mixed encodings such as custom

[jira] [Created] (SPARK-23330) Spark UI SQL executions page throws NPE

2018-02-03 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-23330: Summary: Spark UI SQL executions page throws NPE Key: SPARK-23330 URL: https://issues.apache.org/jira/browse/SPARK-23330 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-23243) Shuffle+Repartition on an RDD could lead to incorrect answers

2018-01-26 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-23243: Summary: Shuffle+Repartition on an RDD could lead to incorrect answers Key: SPARK-23243 URL: https://issues.apache.org/jira/browse/SPARK-23243 Project: Spark

[jira] [Updated] (SPARK-23207) Shuffle+Repartition on an DataFrame could lead to incorrect answers

2018-01-26 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-23207: - Summary: Shuffle+Repartition on an DataFrame could lead to incorrect answers (was:

[jira] [Commented] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-24 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338419#comment-16338419 ] Jiang Xingbo commented on SPARK-23207: -- I'm working on this. > Shuffle+Repartition on an

[jira] [Created] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-24 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-23207: Summary: Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss Key: SPARK-23207 URL: https://issues.apache.org/jira/browse/SPARK-23207 Project: Spark

[jira] [Created] (SPARK-23188) Make vectorized columar reader batch size configurable

2018-01-23 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-23188: Summary: Make vectorized columar reader batch size configurable Key: SPARK-23188 URL: https://issues.apache.org/jira/browse/SPARK-23188 Project: Spark Issue

[jira] [Commented] (SPARK-22360) Add unit test for Window Specifications

2018-01-19 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332773#comment-16332773 ] Jiang Xingbo commented on SPARK-22360: -- Created https://issues.apache.org/jira/browse/SPARK-23160 >

[jira] [Created] (SPARK-23160) Add more window sql tests

2018-01-19 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-23160: Summary: Add more window sql tests Key: SPARK-23160 URL: https://issues.apache.org/jira/browse/SPARK-23160 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-22360) Add unit test for Window Specifications

2018-01-19 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332762#comment-16332762 ] Jiang Xingbo commented on SPARK-22360: -- Sorry for late response. It's great that we can cover the

[jira] [Commented] (SPARK-22297) Flaky test: BlockManagerSuite "Shuffle registration timeout and maxAttempts conf"

2018-01-01 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307501#comment-16307501 ] Jiang Xingbo commented on SPARK-22297: -- How often do we run into this? Personally I can't repro this

[jira] [Commented] (SPARK-22359) Improve the test coverage of window functions

2017-12-13 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289363#comment-16289363 ] Jiang Xingbo commented on SPARK-22359: -- [~smurakozi] Please feel free to PR for this. > Improve the

[jira] [Commented] (SPARK-22757) Init-container in the driver/executor pods for downloading remote dependencies

2017-12-13 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288856#comment-16288856 ] Jiang Xingbo commented on SPARK-22757: -- Is this also targeted to 2.3 release? > Init-container in

[jira] [Commented] (SPARK-22680) SparkSQL scan all partitions when the specified partitions are not exists in parquet formatted table

2017-12-06 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281235#comment-16281235 ] Jiang Xingbo commented on SPARK-22680: -- Could you also post the result of EXPLAIN? Thanks! >

[jira] [Created] (SPARK-22363) Add unit test for Window spilling

2017-10-26 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-22363: Summary: Add unit test for Window spilling Key: SPARK-22363 URL: https://issues.apache.org/jira/browse/SPARK-22363 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-22362) Add unit test for Window Aggregate Functions

2017-10-26 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-22362: Summary: Add unit test for Window Aggregate Functions Key: SPARK-22362 URL: https://issues.apache.org/jira/browse/SPARK-22362 Project: Spark Issue Type:

[jira] [Created] (SPARK-22361) Add unit test for Window Frames

2017-10-26 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-22361: Summary: Add unit test for Window Frames Key: SPARK-22361 URL: https://issues.apache.org/jira/browse/SPARK-22361 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-22360) Add unit test for Window Specifications

2017-10-26 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-22360: Summary: Add unit test for Window Specifications Key: SPARK-22360 URL: https://issues.apache.org/jira/browse/SPARK-22360 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-22359) Improve the test coverage of window functions

2017-10-26 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-22359: Summary: Improve the test coverage of window functions Key: SPARK-22359 URL: https://issues.apache.org/jira/browse/SPARK-22359 Project: Spark Issue Type:

[jira] [Created] (SPARK-22214) Refactor the list hive partitions code

2017-10-06 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-22214: Summary: Refactor the list hive partitions code Key: SPARK-22214 URL: https://issues.apache.org/jira/browse/SPARK-22214 Project: Spark Issue Type:

[jira] [Created] (SPARK-21608) Window rangeBetween() API should allow literal boundary

2017-08-02 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-21608: Summary: Window rangeBetween() API should allow literal boundary Key: SPARK-21608 URL: https://issues.apache.org/jira/browse/SPARK-21608 Project: Spark

[jira] [Created] (SPARK-21496) Support codegen for TakeOrderedAndProjectExec

2017-07-20 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-21496: Summary: Support codegen for TakeOrderedAndProjectExec Key: SPARK-21496 URL: https://issues.apache.org/jira/browse/SPARK-21496 Project: Spark Issue Type:

[jira] [Commented] (SPARK-21410) In RangePartitioner(partitions: Int, rdd: RDD[]), RangePartitioner.numPartitions is wrong if the number of elements in RDD (rdd.count()) is less than number of partiti

2017-07-16 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088970#comment-16088970 ] Jiang Xingbo commented on SPARK-21410: -- This is not a bug since it doesn't generate any wrong

[jira] [Updated] (SPARK-21410) In RangePartitioner(partitions: Int, rdd: RDD[]), RangePartitioner.numPartitions is wrong if the number of elements in RDD (rdd.count()) is less than number of partition

2017-07-16 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-21410: - Issue Type: Improvement (was: Bug) > In RangePartitioner(partitions: Int, rdd: RDD[]), >

[jira] [Commented] (SPARK-14151) Propose to refactor and expose Metrics Sink and Source interface

2017-07-11 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082283#comment-16082283 ] Jiang Xingbo commented on SPARK-14151: -- Since this is purposed to add a set of public API, it would

[jira] [Commented] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-10 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081518#comment-16081518 ] Jiang Xingbo commented on SPARK-21349: -- [~dongjoon] Are you running the test for Spark SQL? Or

[jira] [Created] (SPARK-21366) Add sql test for window functions

2017-07-10 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-21366: Summary: Add sql test for window functions Key: SPARK-21366 URL: https://issues.apache.org/jira/browse/SPARK-21366 Project: Spark Issue Type: Task

[jira] [Updated] (SPARK-19451) rangeBetween method should accept Long value as boundary

2017-07-05 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-19451: - Summary: rangeBetween method should accept Long value as boundary (was: Long values in Window

[jira] [Created] (SPARK-21260) Remove the unused OutputFakerExec

2017-06-29 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-21260: Summary: Remove the unused OutputFakerExec Key: SPARK-21260 URL: https://issues.apache.org/jira/browse/SPARK-21260 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-21225) decrease the Mem using for variable 'tasks' in function resourceOffers

2017-06-28 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-21225: - Issue Type: Bug (was: Improvement) > decrease the Mem using for variable 'tasks' in function

[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer

2017-06-12 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047229#comment-16047229 ] Jiang Xingbo commented on SPARK-18294: -- This is actually legacy code refactoring, it shouldn't

[jira] [Created] (SPARK-20989) Fail to start multiple workers on one host if external shuffle service is enabled in standalone mode

2017-06-05 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-20989: Summary: Fail to start multiple workers on one host if external shuffle service is enabled in standalone mode Key: SPARK-20989 URL:

[jira] [Commented] (SPARK-20832) Standalone master should explicitly inform drivers of worker deaths and invalidate external shuffle service outputs

2017-05-30 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029778#comment-16029778 ] Jiang Xingbo commented on SPARK-20832: -- I'm working on this. > Standalone master should explicitly

[jira] [Comment Edited] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-16 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013199#comment-16013199 ] Jiang Xingbo edited comment on SPARK-20700 at 5/16/17 10:23 PM: In the

[jira] [Commented] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-16 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013199#comment-16013199 ] Jiang Xingbo commented on SPARK-20700: -- In the previous approach we used `aliasMap` to link an

[jira] [Commented] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-12 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008286#comment-16008286 ] Jiang Xingbo commented on SPARK-20700: -- I've reproduced this case, will dive further into it this

[jira] [Issue Comment Deleted] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-12 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-20700: - Comment: was deleted (was: I couldn't reproduce the failure on current master branch, the test

[jira] [Commented] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-12 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007731#comment-16007731 ] Jiang Xingbo commented on SPARK-20700: -- I couldn't reproduce the failure on current master branch,

[jira] [Commented] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-11 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006201#comment-16006201 ] Jiang Xingbo commented on SPARK-20700: -- I'm working on this, thank you! [~joshrosen] >

[jira] [Commented] (SPARK-20680) Spark-sql do not support for void column datatype of view

2017-05-11 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006169#comment-16006169 ] Jiang Xingbo commented on SPARK-20680: -- [~hvanhovell] Sure, I'll look at this issue. > Spark-sql do

[jira] [Commented] (SPARK-20236) Overwrite a partitioned table should only overwrite related partitions

2017-04-06 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958448#comment-15958448 ] Jiang Xingbo commented on SPARK-20236: -- I'm working on this. > Overwrite a partitioned table should

[jira] [Created] (SPARK-19960) Move `SparkHadoopWriter` to `internal/io/`

2017-03-15 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-19960: Summary: Move `SparkHadoopWriter` to `internal/io/` Key: SPARK-19960 URL: https://issues.apache.org/jira/browse/SPARK-19960 Project: Spark Issue Type:

[jira] [Updated] (SPARK-19877) Restrict the nested level of a view

2017-03-10 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiang Xingbo updated SPARK-19877: - Summary: Restrict the nested level of a view (was: Restrict the depth of view reference chains)

[jira] [Created] (SPARK-19877) Restrict the depth of view reference chains

2017-03-08 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-19877: Summary: Restrict the depth of view reference chains Key: SPARK-19877 URL: https://issues.apache.org/jira/browse/SPARK-19877 Project: Spark Issue Type:

[jira] [Commented] (SPARK-19877) Restrict the depth of view reference chains

2017-03-08 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902324#comment-15902324 ] Jiang Xingbo commented on SPARK-19877: -- I'm working on this. > Restrict the depth of view reference

[jira] [Commented] (SPARK-18389) Disallow cyclic view reference

2017-03-01 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891575#comment-15891575 ] Jiang Xingbo commented on SPARK-18389: -- I've just figured out a way to work this out, will try to
