[jira] [Commented] (HADOOP-15634) LocalDirAllocator using up local nonDFS when set to S3

Aaron Fabbri (JIRA) Thu, 26 Jul 2018 15:00:18 -0700


    [ https://issues.apache.org/jira/browse/HADOOP-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558954#comment-16558954 ]


Aaron Fabbri commented on HADOOP-15634:
---------------------------------------

{quote}why do we need local disk.
{quote}
For future reference: check out the 
[documentation|https://hadoop.apache.org/docs/current3/hadoop-aws/tools/hadoop-aws/index.html]
 for S3A, in particular the section "How S3A Writes to S3". The open source S3A 
connector buffers data before uploading it, by default on local disk, which is 
why local disk is needed. It also supports in-memory buffering, and the doc 
offers some numbers for calculating how much memory this would require, which 
is the main downside. As Steve said, this only applies to the Apache Hadoop 
code. EMR's S3 client is not open sourced.
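
To give a rough feel for the memory calculation mentioned above, here is a 
minimal sketch. It is not taken from the S3A code or docs. It assumes a 
simplified model in which each open output stream can hold up to 
fs.s3a.fast.upload.active.blocks blocks of fs.s3a.multipart.size bytes, plus 
one block still being filled. The class name, method names and example values 
are made up for illustration; check the doc for the real guidance.

```java
// Hypothetical helper: a back-of-the-envelope estimate of in-memory
// buffering cost. This is an illustrative model, not S3A's own accounting.
public final class S3ABufferEstimate {

    private S3ABufferEstimate() {
    }

    // Assumed worst case for one stream: the blocks queued or uploading
    // (activeBlocks) plus the one block currently being written to.
    static long perStreamBytes(long blockSizeBytes, int activeBlocks) {
        if (blockSizeBytes <= 0 || activeBlocks <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.multiplyExact(blockSizeBytes, (long) activeBlocks + 1);
    }

    // Scale the per-stream figure by the number of streams open at once
    // in a single process.
    static long totalBytes(long blockSizeBytes, int activeBlocks, int openStreams) {
        if (openStreams <= 0) {
            throw new IllegalArgumentException("openStreams must be positive");
        }
        return Math.multiplyExact(perStreamBytes(blockSizeBytes, activeBlocks),
                (long) openStreams);
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // example value for fs.s3a.multipart.size
        int activeBlocks = 4;               // example value for fs.s3a.fast.upload.active.blocks
        int openStreams = 10;               // example: streams open concurrently

        long mib = 1024L * 1024;
        System.out.println("per stream (MiB): "
                + perStreamBytes(blockSize, activeBlocks) / mib);
        System.out.println("total (MiB): "
                + totalBytes(blockSize, activeBlocks, openStreams) / mib);
    }
}
```

Even with modest settings the total grows quickly with the number of open 
streams, which is why buffering to local disk is the safer choice for tasks 
that write a lot of data.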

> LocalDirAllocator using up local nonDFS when set to S3
> ------------------------------------------------------
>
>                 Key: HADOOP-15634
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15634
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.8.3
>         Environment: EMR-5.15, Hadoop-2.8.3, Hive-2.3.3, Tez-0.8.4, Beeline. 
> Target table is defined for ACID transactions with location on S3. 
> Insert source table is on S3. 
>            Reporter: Phani Kondapalli
>            Priority: Blocker
>
> Manually modified the yarn-site.xml from within the EMR, set the param 
> yarn.nodemanager.local-dirs to point to s3, reloaded the services on Master 
> and Core nodes. Disk seemed to stay intact but hdfs dfsadmin -report showed 
> nonDFS usage and then finally it failed with below error.
> Error: org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1532581073633_0001_2_00, diagnostics=[Task failed, 
> taskId=task_1532581073633_0001_2_00_000898, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Error while running task ( failure ) : 
> attempt_1532581073633_0001_2_00_000898_0:org.apache.hadoop.util.DiskChecker$DiskErrorException:
>  Could not find any valid local directory for 
> output/attempt_1532581073633_0001_2_00_000898_0_10013_1/file.out
>  at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:441)
>  at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:151)
>  at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:132)
>  at 
> org.apache.tez.runtime.library.common.task.local.output.TezTaskOutputFiles.getSpillFileForWrite(TezTaskOutputFiles.java:207)
>  at 
> org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.spill(PipelinedSorter.java:545)
> ...



