[jira] [Closed] (SPARK-4320) JavaPairRDD should supply a saveAsNewHadoopDataset which takes a Job object

Corey J. Nolet (JIRA) Wed, 25 Feb 2015 16:16:24 -0800

     [ https://issues.apache.org/jira/browse/SPARK-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Corey J. Nolet closed SPARK-4320.
---------------------------------
          Resolution: Won't Fix
    Target Version/s: 1.2.1, 1.1.2  (was: 1.1.2, 1.2.1)

> JavaPairRDD should supply a saveAsNewHadoopDataset which takes a Job object 
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-4320
>                 URL: https://issues.apache.org/jira/browse/SPARK-4320
>             Project: Spark
>          Issue Type: Improvement
>          Components: Input/Output, Spark Core
>            Reporter: Corey J. Nolet
>
> I am outputting data to Accumulo using a custom OutputFormat. I have tried 
> using saveAsNewAPIHadoopFile(), and that works, though passing an empty path 
> is a bit weird. Since it isn't really a file I'm storing, but rather a 
> generic Pair dataset, I'd be inclined to use the saveAsHadoopDataset() 
> method, though I'm not at all interested in using the legacy mapred API.
> Perhaps we could supply a saveAsNewHadoopDataset method. Personally, I think 
> there should be two ways of calling into this method. Instead of forcing the 
> user to always set up the Job object explicitly, I'm in the camp of having 
> the following method signature:
> saveAsNewHadoopDataset(keyClass : Class[K], valueClass : Class[V], ofclass : 
> Class[? extends OutputFormat], conf : Configuration). This way, if I'm 
> writing Spark jobs that are going from Hadoop back into Hadoop, I can 
> construct my Configuration once.
> Perhaps an overloaded method signature could be:
> saveAsNewHadoopDataset(job : Job)
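
The relationship between the two proposed overloads can be sketched roughly as follows. This is only an illustration of the idea in the description, not Spark code: Configuration, Job, TableOutputFormat and PairDataset are simplified, hypothetical stand-ins for the Hadoop and Spark types so that the example compiles on its own, and the "output.table" property name is made up. The assumed design is that the Job-based overload reads the key, value and output format classes from the Job and delegates to the Configuration-based overload.

```java
import java.util.HashMap;
import java.util.Map;

public class SaveDatasetSketch {

    /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
    static class Configuration {
        private final Map<String, String> props = new HashMap<>();
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); }
    }

    /** Hypothetical stand-in for org.apache.hadoop.mapreduce.Job. */
    static class Job {
        private final Configuration conf;
        private Class<?> keyClass;
        private Class<?> valueClass;
        private Class<?> outputFormatClass;

        Job(Configuration conf) { this.conf = conf; }
        void setOutputKeyClass(Class<?> c) { keyClass = c; }
        void setOutputValueClass(Class<?> c) { valueClass = c; }
        void setOutputFormatClass(Class<?> c) { outputFormatClass = c; }
        Class<?> getOutputKeyClass() { return keyClass; }
        Class<?> getOutputValueClass() { return valueClass; }
        Class<?> getOutputFormatClass() { return outputFormatClass; }
        Configuration getConfiguration() { return conf; }
    }

    /** Hypothetical stand-in for a custom (e.g. Accumulo) OutputFormat. */
    static class TableOutputFormat { }

    /** Hypothetical stand-in for JavaPairRDD<K, V>; it only records the request. */
    static class PairDataset<K, V> {
        String lastSave;

        // Proposed signature 1: classes and Configuration passed directly,
        // so one Configuration can be built once and reused.
        void saveAsNewHadoopDataset(Class<K> keyClass, Class<V> valueClass,
                                    Class<?> outputFormatClass, Configuration conf) {
            lastSave = keyClass.getSimpleName() + "/" + valueClass.getSimpleName()
                    + " via " + outputFormatClass.getSimpleName()
                    + " to " + conf.get("output.table");
        }

        // Proposed signature 2: everything is taken from an already configured Job.
        @SuppressWarnings("unchecked")
        void saveAsNewHadoopDataset(Job job) {
            saveAsNewHadoopDataset(
                    (Class<K>) job.getOutputKeyClass(),
                    (Class<V>) job.getOutputValueClass(),
                    job.getOutputFormatClass(),
                    job.getConfiguration());
        }
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("output.table", "events"); // made-up property name

        PairDataset<String, Long> pairs = new PairDataset<>();

        // Calling style 1: no Job object needed.
        pairs.saveAsNewHadoopDataset(String.class, Long.class, TableOutputFormat.class, conf);
        System.out.println(pairs.lastSave);

        // Calling style 2: configure a Job once and hand it over.
        Job job = new Job(conf);
        job.setOutputKeyClass(String.class);
        job.setOutputValueClass(Long.class);
        job.setOutputFormatClass(TableOutputFormat.class);
        pairs.saveAsNewHadoopDataset(job);
        System.out.println(pairs.lastSave);
    }
}
```

With this shape, both calling styles describe the same write, and neither needs the empty output path mentioned in the description.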



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
