[
https://issues.apache.org/jira/browse/HADOOP-17977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ismail updated HADOOP-17977:
----------------------------
Description:
Is it possible to make {{PENDING_DIR_NAME}} configurable?
That would enable concurrent writes to the same location. Currently, if two Spark
processes write to the same destination, one of them fails.
Current:
{code:java}
public static final String PENDING_DIR_NAME = "_temporary";{code}
New:
{code:java}
PENDING_DIR_NAME = conf.get("mapreduce.fileoutputcommitter.pending.dir",
"_temporary");{code}
Here is a custom committer that does this:
https://gist.github.com/ismailsimsek/33c55d8e1fcfc79160483c38a978edbd
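To make the idea concrete, here is a minimal, self-contained sketch of how a configurable pending directory would keep two jobs apart under the same destination. It is not the FileOutputCommitter code: a plain Map stands in for the Hadoop Configuration object, java.nio.file.Path stands in for the Hadoop Path, and the class and method names (PendingDirSketch, pendingDirName, pendingJobPath) are made up for illustration. The key "mapreduce.fileoutputcommitter.pending.dir" is the one proposed in this issue and does not exist in Hadoop today.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class PendingDirSketch {
    // Key proposed in this issue (hypothetical, not an existing Hadoop setting).
    static final String PENDING_DIR_KEY = "mapreduce.fileoutputcommitter.pending.dir";

    // The value that is hard-coded today.
    static final String DEFAULT_PENDING_DIR_NAME = "_temporary";

    // Stand-in for conf.get(key, defaultValue): fall back to "_temporary"
    // when the job does not set the key.
    static String pendingDirName(Map<String, String> conf) {
        return conf.getOrDefault(PENDING_DIR_KEY, DEFAULT_PENDING_DIR_NAME);
    }

    // Directory under the output location where a job stages its files.
    static Path pendingJobPath(Path outputDir, Map<String, String> conf) {
        return outputDir.resolve(pendingDirName(conf));
    }

    public static void main(String[] args) {
        Path output = Paths.get("warehouse", "events");

        // Job A keeps the default, job B overrides the pending directory name.
        Map<String, String> jobA = new HashMap<>();
        Map<String, String> jobB = new HashMap<>();
        jobB.put(PENDING_DIR_KEY, "_temporary_jobB");

        Path pendingA = pendingJobPath(output, jobA);
        Path pendingB = pendingJobPath(output, jobB);

        System.out.println("job A stages in: " + pendingA);
        System.out.println("job B stages in: " + pendingB);
        System.out.println("staging dirs collide: " + pendingA.equals(pendingB));
    }
}
```

With the default in place, existing jobs would keep staging under "_temporary", so the change would be opt-in. Each concurrent writer would still have to pick a distinct name itself.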
Labels: easyfix (was: )
Summary: FileOutputCommitter Enable Concurrent Writes (was: Enable )
> FileOutputCommitter Enable Concurrent Writes
> --------------------------------------------
>
> Key: HADOOP-17977
> URL: https://issues.apache.org/jira/browse/HADOOP-17977
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: ismail
> Priority: Major
> Labels: easyfix