[ 
https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048164#comment-16048164
 ] 

Aarati Khobare commented on SPARK-18294:
----------------------------------------


Hi Jiang

I am new to Spark and Hive, so please let me know if I am missing something.

We are running an INSERT command on a Hive table (created from a storage 
handler with a custom input/output format) through the Spark 2 shell. The 
executors start and do their work, but the driver keeps waiting. The custom 
input/output format is implemented using the older mapred API.

Most likely the problem is with the committer. 
The custom output format class does not have a getOutputCommitter() method. I 
looked at the code in SparkHadoopMapReduceWriter.scala, and it does not seem to 
take the committer class from JobConf.
Also, the output format's checkOutputSpecs is not called, even when the 
spark.hadoop.validateOutputSpecs property is set to true.
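
To illustrate what I mean about the committer, here is a minimal sketch (not 
our actual code) of how the old mapred API exposes the committer through 
JobConf rather than through the output format. It assumes the Hadoop mapred 
classes are on the classpath, as they are in spark-shell. FileOutputCommitter 
is used only as a stand-in for whatever committer a job would configure.

```scala
import org.apache.hadoop.mapred.{FileOutputCommitter, JobConf, OutputCommitter}

// In spark-shell one could instead build this from sc.hadoopConfiguration.
val jobConf = new JobConf()

// With the old mapred API the committer class is registered on the JobConf;
// org.apache.hadoop.mapred.OutputFormat has no getOutputCommitter() method.
jobConf.setOutputCommitter(classOf[FileOutputCommitter])

// JobConf instantiates the configured committer class on request.
val committer: OutputCommitter = jobConf.getOutputCommitter
println(committer.getClass.getName)
```

A writer that only asks the output format for a committer would never see the 
class configured this way, which is what we suspect is happening in our case.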

From JIRA and the code, it seems that Spark 2 does not support the mapred API 
and supports only the mapreduce API.
Please let us know if we are missing anything.

Thanks.









> Implement commit protocol to support `mapred` package's committer
> -----------------------------------------------------------------
>
>                 Key: SPARK-18294
>                 URL: https://issues.apache.org/jira/browse/SPARK-18294
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>            Reporter: Jiang Xingbo
>
> The current `FileCommitProtocol` is based on the `mapreduce` package; we 
> should implement a `HadoopMapRedCommitProtocol` that supports the older 
> `mapred` package's committer.



