Spark was built on the standard Hadoop InputFormat and OutputFormat
interfaces, so any InputFormat and OutputFormat should, in principle, be
supported. Besides the simplified interfaces for text files
(sparkContext.textFile(...)) and sequence files
(sparkContext.sequenceFile(...)), you can specify your own InputFormat in
sparkContext.hadoopFile(...) and your own OutputFormat when saving a pair
RDD with saveAsHadoopFile(...). As suggested in the first response, check
out the API.
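
For what it's worth, here is a rough sketch of what that can look like in
Scala. It is only an illustration, not tested against your setup: the paths
/tmp/input and /tmp/output, the app name, and the local master are
placeholders I made up, and TextInputFormat / TextOutputFormat (from the old
org.apache.hadoop.mapred API) simply stand in for whatever custom formats you
have. On older Spark releases you may also need
import org.apache.spark.SparkContext._ to get the pair RDD methods.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{TextInputFormat, TextOutputFormat}
import org.apache.spark.{SparkConf, SparkContext}

object HadoopFormatsSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder app name and master; adjust for your cluster.
    val conf = new SparkConf()
      .setAppName("hadoop-formats-sketch")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    // hadoopFile takes the key type, the value type, and the InputFormat
    // class. Swap TextInputFormat for your own InputFormat here.
    val records =
      sc.hadoopFile[LongWritable, Text, TextInputFormat]("/tmp/input")

    // Hadoop record readers reuse Writable objects, so copy the contents
    // out (here via toString) before doing anything else with them.
    val lines = records.map { case (_, text) => text.toString }

    // Saving goes through a pair RDD; the OutputFormat is given as a type
    // parameter. TextOutputFormat writes key and value using toString.
    lines
      .map(line => (line, line.length))
      .saveAsHadoopFile[TextOutputFormat[String, Int]]("/tmp/output")

    sc.stop()
  }
}
```

If your formats use the newer org.apache.hadoop.mapreduce API, the
corresponding calls are newAPIHadoopFile and saveAsNewAPIHadoopFile.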

TD


On Sat, Jan 18, 2014 at 10:16 PM, Ankur Chauhan <[email protected]> wrote:

> You may also want to consider Parquet (http://parquet.io). It is pretty
> efficient http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/
>
> -- Ankur Chauhan
