[GitHub] spark pull request: [SPARK-9879][SQL][WIP] Fix OOM in Limit clause...
Github user chenghao-intel closed the pull request at: https://github.com/apache/spark/pull/8128 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-165672100 Thank you @andrewor14, I will close this PR for now and reopen it once I have a better idea for this fix.
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-164880475 Is this still an issue given all the latest memory management changes? @chenghao-intel are you still able to reproduce this in master?
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134860821 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41596/
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134860816 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134860489 [Test build #41596 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41596/console) for PR 8128 at commit [`50b33d8`](https://github.com/apache/spark/commit/50b33d80532c20c074b6973bf124b5391df17fca). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class LargeLimit(limit: Int, child: SparkPlan)`
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134889218 How about adding a lazy version of `RDD.take` which returns an `RDD` rather than an `Array`?
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134902205 Are you talking about `RDD.toLocalIterator`?
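The lazy `take` cloud-fan proposes and `RDD.toLocalIterator` share the same idea: stream the result one partition at a time instead of materializing everything on the driver at once. A minimal Python sketch of that idea (lists standing in for RDD partitions; `lazy_take` is a hypothetical name, not a Spark API):

```python
def lazy_take(partitions, n):
    """Yield at most n records, pulling one partition at a time.
    Mimics RDD.toLocalIterator: only one partition's worth of records
    needs to be resident on the driver at any moment, so a large limit
    no longer forces all rows into driver memory at once."""
    remaining = n
    for part in partitions:  # in Spark this would be one job per partition
        for record in part:
            if remaining == 0:
                return
            yield record
            remaining -= 1

print(list(lazy_take([[1, 2, 3], [4, 5], [6]], 4)))  # prints [1, 2, 3, 4]
```

The trade-off is latency: fetching partitions one by one serializes the jobs, which is why the PR instead pre-computes how many rows each partition must contribute.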
Github user Mageswaran1989 commented on a diff in the pull request: https://github.com/apache/spark/pull/8128#discussion_r38062415 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -192,6 +192,12 @@ private[spark] object SQLConf { column based on statistics of the data.", isPublic = false) + val LIMIT_ROWS = longConf("spark.sql.limit.rows", + defaultValue = Some(10L), + doc = "For the LIMIT clause, put all of the output rows in a single partition " + + "iif the required row number less than the threshold, otherwise fetch the rows in a " + --- End diff -- I think `iif` is a typo.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134831582 [Test build #41596 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41596/consoleFull) for PR 8128 at commit [`50b33d8`](https://github.com/apache/spark/commit/50b33d80532c20c074b6973bf124b5391df17fca).
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134831174 retest this please
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134831408 Merged build started.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-134831400 Merged build triggered.
Github user GraceH commented on a diff in the pull request: https://github.com/apache/spark/pull/8128#discussion_r37046803 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala --- @@ -224,6 +225,56 @@ case class Limit(limit: Int, child: SparkPlan) /** * :: DeveloperApi :: + * Take the first limit elements. and the limit can be any number less than Integer.MAX_VALUE. + * If it is terminal and is invoked using executeCollect, it probably cause OOM if the + * records number is large enough. Not like the Limit clause, this operator will not change + * any partitions of its child operator. + */ +@DeveloperApi +case class LargeLimit(limit: Int, child: SparkPlan) + extends UnaryNode { + /** We must copy rows when sort based shuffle is on */ + private def sortBasedShuffleOn = SparkEnv.get.shuffleManager.isInstanceOf[SortShuffleManager] + + override def output: Seq[Attribute] = child.output + + override def executeCollect(): Array[Row] = child.executeTake(limit) + + protected override def doExecute(): RDD[InternalRow] = { + val rdd = if (sortBasedShuffleOn) { + child.execute().map(_.copy()).persist(StorageLevel.MEMORY_AND_DISK) + } else { + child.execute().persist(StorageLevel.MEMORY_AND_DISK) + } + + // We assume the maximize record number in a partition is less than Integer.MAX_VALUE + val partitionRecordCounts = rdd.mapPartitions({ iterator => + Iterator(iterator.count(_ => true)) + }, true).collect() + + var totalSize = 0 + // how many records we have to take from each partition + val requiredRecordCounts = partitionRecordCounts.map { count => --- End diff -- Just a minor suggestion.
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/8128#discussion_r37046687 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala --- + var totalSize = 0 + // how many records we have to take from each partition + val requiredRecordCounts = partitionRecordCounts.map { count => --- End diff -- Yes, that's true, but this is probably not a big issue on the driver side, as it's computed only once. Maybe we can fix that with https://github.com/apache/spark/pull/8128/files#diff-cfe05f35a9d919bf4480ec6980e91554R269
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130911881 Can you describe your proposed approach in the pull request description and JIRA?
Github user GraceH commented on a diff in the pull request: https://github.com/apache/spark/pull/8128#discussion_r37044861 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala --- + var totalSize = 0 + // how many records we have to take from each partition + val requiredRecordCounts = partitionRecordCounts.map { count => --- End diff -- Would it be more efficient to use a loop? For example: with 1000 partitions whose counts are (100, 4, 5, 700, 10, ...) and a limit of 10, a loop can stop after the first partition, while `map` does the calculation for all 1000. Besides, maybe we can save the storage space for `requiredRecordCounts`.
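GraceH's early-exit loop can be sketched outside Spark. A minimal Python version, with plain integer counts standing in for the per-partition record counts (`required_counts` is a hypothetical name, not from the PR):

```python
def required_counts(partition_counts, limit):
    """Compute how many records to take from each partition, walking the
    counts in order and stopping as soon as the limit is covered -- the
    loop-based alternative to mapping over every partition count."""
    taken = 0
    result = []
    for count in partition_counts:
        want = min(count, limit - taken)
        result.append(want)
        taken += want
        if taken == limit:
            break  # remaining partitions contribute nothing; stop early
    return result

# GraceH's example: the first partition alone covers the limit,
# so the loop touches one entry instead of all of them.
print(required_counts([100, 4, 5, 700, 10], 10))  # prints [10]
```

Partitions past the break point simply get no entry, which is also where her point about saving storage space for `requiredRecordCounts` comes from.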
Github user GraceH commented on a diff in the pull request: https://github.com/apache/spark/pull/8128#discussion_r37045748 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala --- + } else { + child.execute().persist(StorageLevel.MEMORY_AND_DISK) --- End diff -- Besides, it is very hard to tell what storage level to pick. Another option may be to document this as a first step and mark it as a TODO item: if your limit is larger than the threshold, you should run with LargeLimit; it may bring some performance loss, but at least the query finishes without an OOM exception.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130324459 Merged build started.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130324416 Merged build triggered.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130341196 Merged build finished. Test FAILed.
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/8128#discussion_r36867308 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala --- + } else { + child.execute().persist(StorageLevel.MEMORY_AND_DISK) --- End diff -- That's the main reason I put WIP in the title: persisting the RDD helps avoid the re-computation, but I have no idea how to unpersist it automatically.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130326577 Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130336313 Merged build triggered.
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130324170 cc @JoshRosen @cloud-fan @liancheng
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130335280 retest this please
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8128#issuecomment-130336342 Merged build started.