[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16924


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-02-16 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16924#discussion_r101502214
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -137,7 +138,13 @@ private[spark] class EventLoggingListener(
 // scalastyle:on println
 if (flushLogger) {
   writer.foreach(_.flush())
-  hadoopDataStream.foreach(_.hflush())
+  hadoopDataStream.foreach(ds => {
--- End diff --

OK. Is it not better to just call hsync in all cases, or do you have to
special-case this?





[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-02-16 Thread dosoft
Github user dosoft commented on a diff in the pull request:

https://github.com/apache/spark/pull/16924#discussion_r101487342
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -137,7 +138,13 @@ private[spark] class EventLoggingListener(
 // scalastyle:on println
 if (flushLogger) {
   writer.foreach(_.flush())
-  hadoopDataStream.foreach(_.hflush())
+  hadoopDataStream.foreach(ds => {
--- End diff --

hsync() is even stronger than hflush(): under the covers both methods
use the same flushOrSync(), but hsync() performs additional tasks such as
flushing OS buffers (fsync).
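
The point above can be illustrated with a small sketch. It is not the code from this pull request and does not use Hadoop: `LengthAwareStream`, `OtherStream`, `DataStream`, and `flushEvents` are hypothetical stand-ins for the wrapped stream, the outer data stream, and the flush step. Assuming hsync() does everything hflush() does and more, as described above, the length-aware branch needs only one call.

```scala
// Stand-in types only: hypothetical, not the Hadoop classes themselves.
object SyncDispatchSketch {
  sealed trait Wrapped

  /** Plays the role of a wrapped stream that supports a length-updating sync. */
  final class LengthAwareStream extends Wrapped {
    var syncs = 0
    def hsyncUpdateLength(): Unit = syncs += 1
  }

  /** Plays the role of any other wrapped stream. */
  final class OtherStream extends Wrapped

  /** Plays the role of the outer data stream that exposes its wrapped stream. */
  final class DataStream(val wrapped: Wrapped) {
    var flushes = 0
    def hflush(): Unit = flushes += 1
  }

  /** One call per flush: the stronger sync where available, hflush otherwise. */
  def flushEvents(ds: DataStream): Unit = ds.wrapped match {
    case w: LengthAwareStream => w.hsyncUpdateLength() // no extra hflush here
    case _: OtherStream       => ds.hflush()
  }

  def main(args: Array[String]): Unit = {
    val hdfsLike = new DataStream(new LengthAwareStream)
    val local = new DataStream(new OtherStream)
    flushEvents(hdfsLike)
    flushEvents(local)
    println(s"hdfsLike flushes=${hdfsLike.flushes}, local flushes=${local.flushes}")
  }
}
```

In this model the stream that supports the stronger sync never receives a separate hflush, while every other stream keeps the existing hflush behavior.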





[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-02-15 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16924#discussion_r101403013
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -137,7 +138,13 @@ private[spark] class EventLoggingListener(
 // scalastyle:on println
 if (flushLogger) {
   writer.foreach(_.flush())
-  hadoopDataStream.foreach(_.hflush())
+  hadoopDataStream.foreach(ds => {
--- End diff --

OK, if in doubt, would it perhaps be safer to preserve the existing
behavior and hflush in all cases?





[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-02-14 Thread dosoft
Github user dosoft commented on a diff in the pull request:

https://github.com/apache/spark/pull/16924#discussion_r101161834
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -137,7 +138,13 @@ private[spark] class EventLoggingListener(
 // scalastyle:on println
 if (flushLogger) {
   writer.foreach(_.flush())
-  hadoopDataStream.foreach(_.hflush())
+  hadoopDataStream.foreach(ds => {
--- End diff --

It seems hflush() is not required there.





[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-02-14 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16924#discussion_r101126469
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -20,16 +20,17 @@ package org.apache.spark.scheduler
 import java.io._
 import java.net.URI
 import java.nio.charset.StandardCharsets
+import java.util
--- End diff --

Import the class please





[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-02-14 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16924#discussion_r101126998
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -137,7 +138,13 @@ private[spark] class EventLoggingListener(
 // scalastyle:on println
 if (flushLogger) {
   writer.foreach(_.flush())
-  hadoopDataStream.foreach(_.hflush())
+  hadoopDataStream.foreach(ds => {
--- End diff --

```
...foreach(df => df.getWrappedStream match {
  case wrapped: DFSOutputStream => wrapped.hsync(...)
  case _ => df.hflush()
})
```
 maybe? I think that 95% works. You don't hflush in the first case?





[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-02-14 Thread dosoft
GitHub user dosoft opened a pull request:

https://github.com/apache/spark/pull/16924

[SPARK-19531] Send UPDATE_LENGTH for Spark History service

## What changes were proposed in this pull request?

While Spark writes to the .inprogress file (stored on HDFS), Hadoop doesn't
update the file length until close, and therefore Spark's history server can't
detect any changes. We have to send UPDATE_LENGTH manually.
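
The problem described above can be pictured with a toy model. It is a sketch under stated assumptions, not Hadoop or Spark code: `ToyNameNode`, `ToyOutputStream`, `hsyncUpdateLength`, and `hasChanged` are hypothetical names, and the poller is assumed to detect changes only by comparing the reported file length.

```scala
// Toy model only: these classes are hypothetical stand-ins, not Hadoop APIs.
object LengthVisibilityDemo {
  /** Holds the length a metadata service (the NameNode, in HDFS) would report. */
  final class ToyNameNode { var reportedLength: Long = 0L }

  /** A writer whose flushed bytes are not reflected in metadata unless requested. */
  final class ToyOutputStream(nn: ToyNameNode) {
    private var written = 0L
    def write(bytes: Array[Byte]): Unit = written += bytes.length
    // Models hflush(): the reported length is left stale.
    def hflush(): Unit = ()
    // Models a sync with UPDATE_LENGTH: also publishes the current length.
    def hsyncUpdateLength(): Unit = nn.reportedLength = written
    // Models close(): the final length is always published.
    def close(): Unit = nn.reportedLength = written
  }

  /** A poller that sees only the reported length (an assumption of this sketch). */
  def hasChanged(nn: ToyNameNode, lastSeen: Long): Boolean =
    nn.reportedLength > lastSeen

  def main(args: Array[String]): Unit = {
    val nn = new ToyNameNode
    val out = new ToyOutputStream(nn)
    out.write("event-1\n".getBytes("UTF-8"))
    out.hflush()
    println(hasChanged(nn, 0L)) // false: length still reported as 0
    out.hsyncUpdateLength()
    println(hasChanged(nn, 0L)) // true: length now reported as 8
  }
}
```

In this model a reader that keys on file length sees nothing new after hflush alone, which is the symptom reported for the .inprogress file.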


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dosoft/spark SPARK-19531

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16924.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16924


commit f87c5155832435c9dc17053521d61ae0ce06f8d8
Author: Oleg Danilov 
Date:   2017-02-01T13:06:22Z

[SPARK-19531] Send UPDATE_LENGTH for Spark History service



