[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97242/


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #97242 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97242/testReport)** for PR 22347 at commit [`8666272`](https://github.com/apache/spark/commit/86662722e53bfcae2c75e61d170c983abd599b3a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #97242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97242/testReport)** for PR 22347 at commit [`8666272`](https://github.com/apache/spark/commit/86662722e53bfcae2c75e61d170c983abd599b3a).


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-11 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22347
  
retest this please


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-09 Thread tooptoop4
Github user tooptoop4 commented on the issue:

https://github.com/apache/spark/pull/22347
  
@dongjoon-hyun can this be merged?


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96937/


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #96937 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96937/testReport)** for PR 22347 at commit [`8666272`](https://github.com/apache/spark/commit/86662722e53bfcae2c75e61d170c983abd599b3a).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue:

https://github.com/apache/spark/pull/22347
  
I added example code for the issue case to the PR description.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #96937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96937/testReport)** for PR 22347 at commit [`8666272`](https://github.com/apache/spark/commit/86662722e53bfcae2c75e61d170c983abd599b3a).


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96870/


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #96870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96870/testReport)** for PR 22347 at commit [`a8f1481`](https://github.com/apache/spark/commit/a8f14817ce3f52f710c3341148c2e1f3374335eb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-02 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue:

https://github.com/apache/spark/pull/22347
  
Thank you for the review.
Yes, ThriftServer will use the intermediate "collection view" introduced in this PR,
and the [original ThriftServer PR](https://github.com/apache/spark/pull/22219)
will be updated accordingly if this PR is merged.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #96870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96870/testReport)** for PR 22347 at commit [`a8f1481`](https://github.com/apache/spark/commit/a8f14817ce3f52f710c3341148c2e1f3374335eb).


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22347
  
Thank you for your first contribution, @Dooyoung-Hwang ! So, this is a 
spin-off PR from STS?


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22347
  
Retest this please.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95892/


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #95892 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95892/testReport)** for PR 22347 at commit [`a8f1481`](https://github.com/apache/spark/commit/a8f14817ce3f52f710c3341148c2e1f3374335eb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #95892 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95892/testReport)** for PR 22347 at commit [`a8f1481`](https://github.com/apache/spark/commit/a8f14817ce3f52f710c3341148c2e1f3374335eb).


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22347
  
retest this please


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue:

https://github.com/apache/spark/pull/22347
  
Jenkins, retest this please


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue:

https://github.com/apache/spark/pull/22347
  
I tested on my local PC (3.3 GHz Intel Core i5) and selected 400,000 rows x 
25 times.
I measured the total execution time spent in decodeUnsafeRows.
My test data is skewed, so the rows gathered from the executors are distributed 
between 40 and 80.
The average execution time decreased from 175.92 ms to 93.52 ms.
Memory usage also improved, and total GC time decreased from 13.883 sec 
to 10.764 sec.

## Before Patch
### GC statistics

S0 | S1 | E | O | M | CCS | YGC | YGCT | FGC | FGCT | GCT
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
0 | 100 | 24.92 | 48.66 | 96.92 | 88.25 | 150 | 13.883 | 0 | 0 | 13.883

### Wall time : AVG 175.92 ms

Row Count | Decode Time (ms)
-- | --
428942 | 73
473726 | 106
476322 | 78
509996 | 83
510590 | 124
556896 | 94
556896 | 362
595272 | 193
595272 | 175
642478 | 120
644970 | 279
679544 | 269
693354 | 116
723532 | 124
729912 | 136
730218 | 120
730246 | 184
773640 | 183
774148 | 380
810198 | 128
811606 | 131
859090 | 138
895474 | 314
895954 | 339
939636 | 149

## After Patch : 93.52 ms

### GC statistics

S0 | S1 | E | O | M | CCS | YGC | YGCT | FGC | FGCT | GCT
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
0 | 100 | 81.37 | 33.34 | 97.21 | 88.35 | 127 | 10.764 | 0 | 0 | 10.764

### Wall time : AVG 93.52 ms

Row Count | Decode Time (ms)
-- | --
421922 | 61
422516 | 180
422850 | 110
473218 | 62
473218 | 103
473438 | 115
507198 | 60
554606 | 144
557202 | 119
601392 | 71
642652 | 61
645276 | 64
679036 | 64
679036 | 63
729624 | 242
729652 | 62
729912 | 131
773814 | 122
774234 | 62
807908 | 59
810198 | 64
814900 | 72
844772 | 59
858582 | 127
858582 | 61
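
As a sanity check, the two reported averages can be recomputed from the decode-time columns in the tables above. The sketch below is only an illustration: the object and method names (`DecodeTimeAverages`, `mean`, `reduction`) are made up here and are not part of the PR, and the timings are copied from the 25 rows of each table.

```scala
object DecodeTimeAverages {
  // Decode times in ms, copied from the "Before Patch" table.
  val beforeMs: Seq[Int] = Seq(
    73, 106, 78, 83, 124, 94, 362, 193, 175, 120, 279, 269, 116,
    124, 136, 120, 184, 183, 380, 128, 131, 138, 314, 339, 149)

  // Decode times in ms, copied from the "After Patch" table.
  val afterMs: Seq[Int] = Seq(
    61, 180, 110, 62, 103, 115, 60, 144, 119, 71, 61, 64, 64,
    63, 242, 62, 131, 122, 62, 59, 64, 72, 59, 127, 61)

  // Arithmetic mean of a non-empty sequence of timings.
  def mean(xs: Seq[Int]): Double = xs.sum.toDouble / xs.size

  // Relative reduction (0.0 to 1.0) when going from `before` to `after`.
  def reduction(before: Double, after: Double): Double = (before - after) / before

  def main(args: Array[String]): Unit = {
    val b = mean(beforeMs) // 175.92, matching the reported average
    val a = mean(afterMs)  // 93.52, matching the reported average
    println(f"before: $b%.2f ms, after: $a%.2f ms")
    println(f"wall-time reduction: ${reduction(b, a) * 100}%.1f%%")
    // GC totals (seconds) are taken from the GCT column of the two GC tables.
    println(f"GC-time reduction: ${reduction(13.883, 10.764) * 100}%.1f%%")
  }
}
```

Both runs contain 25 samples, so the means are directly comparable; the individual timings are noisy (for example 59 ms to 242 ms after the patch), so the averages should be read as a rough indication rather than a precise benchmark.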





---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95859/


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22347
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #95859 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95859/testReport)** for PR 22347 at commit [`a8f1481`](https://github.com/apache/spark/commit/a8f14817ce3f52f710c3341148c2e1f3374335eb).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22347
  
**[Test build #95859 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95859/testReport)** for PR 22347 at commit [`a8f1481`](https://github.com/apache/spark/commit/a8f14817ce3f52f710c3341148c2e1f3374335eb).


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22347
  
ok to test


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22347
  
Let me leave this as ok to test, since there looks to be progress here anyway.


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-07 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22347
  
btw, do we have any actual performance benefit (wall time) from this patch?


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-07 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue:

https://github.com/apache/spark/pull/22347
  
@kiszk 
It is impossible to count decoded rows without modifying SparkPlan, because 
there is no way to count how many elements have been iterated.

Instead, I can simulate this patch in a Scala worksheet with the code below.

```scala
import scala.collection.mutable.ArrayBuffer

// Number of elements that have actually been decoded so far.
var decodeCount = 0

// Simulated decoder: yields buf.sum elements and counts each one as it is produced.
// Note that it consumes (mutates) buf while iterating.
def decoding(buf: Array[Int]): Iterator[String] = {
  new Iterator[String] {
    var remain = buf.sum
    var index = 0
    override def hasNext: Boolean = remain > 0
    override def next(): String = {
      while (buf(index) == 0) index += 1
      buf(index) -= 1
      remain -= 1
      decodeCount += 1 // increase decodeCount
      f"[decode Result:$remain]"
    }
  }
}

// reset decodeCount
decodeCount = 0

// Before Patch : decode without scala view (every element is decoded eagerly)
val buf = new ArrayBuffer[String]
val inputIter = Array(Array(2, 2, 2), Array(2), Array(2)).iterator
while (inputIter.hasNext) buf ++= Array(inputIter.next()).flatMap(decoding)
val result1 = buf.take(3).toArray

// ensure decode count is 10
assert(decodeCount == 10)

// reset decodeCount
decodeCount = 0

// After Patch : decode with scala view (only the taken elements are decoded)
val result2 = ArrayBuffer(Array(2, 2, 2), Array(2), Array(2)).toArray.view
  .flatMap(decoding).take(3).force

// ensure decode count is 3
assert(decodeCount == 3)

// assert same element
assert(result1 sameElements result2)
```
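
The same early-termination behaviour can also be sketched with plain iterators, which avoids relying on `view` and `force` (`force` is deprecated as of Scala 2.13). This is only an illustration under assumed names: `LazyDecodeSketch`, `decode`, and `firstRows` are hypothetical and are not taken from the PR or from SparkPlan.

```scala
object LazyDecodeSketch {
  // Counts how many elements have actually been decoded.
  var decoded = 0

  // Hypothetical stand-in for row decoding: each element is decoded
  // (and counted) only when the returned iterator is advanced.
  def decode(block: Array[Int]): Iterator[String] =
    block.iterator.map { x => decoded += 1; s"row-$x" }

  // Decodes blocks lazily and stops as soon as n rows have been produced.
  def firstRows(blocks: Seq[Array[Int]], n: Int): Array[String] =
    blocks.iterator.flatMap(decode).take(n).toArray

  def main(args: Array[String]): Unit = {
    val blocks = Seq(Array(1, 2, 3), Array(4), Array(5))
    val rows = firstRows(blocks, 3)
    // Only the three requested rows are decoded; the last two blocks are never touched.
    println(s"rows=${rows.mkString(",")} decoded=$decoded")
  }
}
```

Because `Iterator.flatMap` and `Iterator.take` are both lazy, the decode function runs once per element that is actually pulled, which is the effect the view-based simulation above demonstrates with `decodeCount == 3`.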


---




[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-06 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22347
  
Thank you for your update.
Would it be better to add a test case to confirm that the state of the internal 
structures is as you expect? @maropu


---
