[ 
https://issues.apache.org/jira/browse/SPARK-25855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667270#comment-16667270
 ] 

Steve Loughran commented on SPARK-25855:
----------------------------------------

Thanks for the mention. Yes, probing stream capabilities is the way to go. 
HDFS-11644 shows that probe was added precisely because the alternative was to 
have some static "WhenWeSaidWeImplementSyncableWeLied" kind of interface, which 
isn't dynamic enough or sustainable long term (what happens if you subclass 
that, ...).

See also HADOOP-13327
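
A rough sketch of what such a probe could look like on the caller's side. 
It does not depend on Hadoop: the Capabilities interface and both stream 
classes are hypothetical stand-ins, modeled on the 
StreamCapabilities.hasCapability(String) probe and its "hflush"/"hsync" 
capability names. Against a real cluster you would ask the stream returned by 
FileSystem.create() instead, and the answer for an erasure-coded file is 
assumed here to be "no", per HDFS-11644.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

class StreamCapabilityProbe {

    /** Hypothetical stand-in modeled on Hadoop's StreamCapabilities probe. */
    interface Capabilities {
        boolean hasCapability(String capability);
    }

    /** Stand-in for a stream to a replicated file: flush calls do real work. */
    static class ReplicatedStream extends ByteArrayOutputStream implements Capabilities {
        @Override
        public boolean hasCapability(String capability) {
            return "hflush".equals(capability) || "hsync".equals(capability);
        }
    }

    /** Stand-in for a stream to an erasure-coded file: flush calls are no-ops. */
    static class ErasureCodedStream extends ByteArrayOutputStream implements Capabilities {
        @Override
        public boolean hasCapability(String capability) {
            return false;
        }
    }

    /**
     * Ask the stream instance at runtime rather than trusting its static type.
     * A stream that cannot be probed at all is treated as unable to flush.
     */
    static boolean canFlushIncrementally(OutputStream out) {
        return out instanceof Capabilities
            && ((Capabilities) out).hasCapability("hflush");
    }

    public static void main(String[] args) {
        System.out.println(canFlushIncrementally(new ReplicatedStream()));
        System.out.println(canFlushIncrementally(new ErasureCodedStream()));
        // A plain stream with no probe at all is also reported as unable to flush.
        System.out.println(canFlushIncrementally(new ByteArrayOutputStream()));
    }
}
```

Because the check runs against the instance, a subclass can change its answer 
without any change to the type hierarchy, which is the problem a static marker 
interface cannot solve.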

> Don't use Erasure Coding for event log files
> --------------------------------------------
>
>                 Key: SPARK-25855
>                 URL: https://issues.apache.org/jira/browse/SPARK-25855
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> While testing Spark with HDFS erasure coding (new in Hadoop 3), we ran into a 
> bug with the event logs.  The main issue was a bug in HDFS (HDFS-14027), but 
> it did make us wonder whether Spark should be using EC for event log files in 
> general.  It's a poor choice because EC currently implements {{hflush()}} and 
> {{hsync()}} as no-ops, which means you won't see anything in your event logs 
> until the app is complete.  That isn't necessarily a bug, but it isn't really 
> great.  So I think we should ensure EC is always off for event logs.
> IIUC there is *not* a problem with applications which die without properly 
> closing the outputstream.  It'll take a while for the NN to realize the 
> client is gone and finish the block, but the data should get there eventually.
> Also related are SPARK-24787 & SPARK-19531.
> The space savings from EC would be nice as the event logs can get somewhat 
> large, but I think other factors outweigh this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

