Re: Using Hadoop Custom Input format in Spark

2015-10-27 Thread Sabarish Sasidharan
Did you try sc.binaryFiles(), which gives you an RDD of (file path, PortableDataStream) pairs, where each stream wraps the underlying bytes?
On Tue, Oct 27, 2015 at 10:23 PM, Balachandar R.A. wrote:
> Hello,
> I have developed a Hadoop-based solution that processes a binary file. This
> uses the classic Hadoop MR techni

Re: Using Hadoop Custom Input format in Spark

2015-10-27 Thread ayan guha
Mind sharing the error you are getting?
On 28 Oct 2015 03:53, "Balachandar R.A." wrote:
> Hello,
> I have developed a Hadoop-based solution that processes a binary file. This
> uses the classic Hadoop MR technique. The binary file is about 10GB and divided
> into 73 HDFS blocks, and the business lo

Using Hadoop Custom Input format in Spark

2015-10-27 Thread Balachandar R.A.
Hello, I have developed a Hadoop-based solution that processes a binary file. This uses the classic Hadoop MR technique. The binary file is about 10GB and divided into 73 HDFS blocks, and the business logic, written as a map process, operates on each of these 73 blocks. We have developed a customInputForma
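
The figures in the message above can be sanity-checked with a little arithmetic. The sketch below assumes an HDFS block size of 128 MiB (a common default, but not stated in the message) and uses a hypothetical helper, blocksFor, to count how many blocks a file of a given length occupies. Under that assumption, a file spanning 73 blocks holds between 9,663,676,417 and 9,797,894,144 bytes, which is consistent with "about 10GB".

```java
// Sanity check of the block count quoted in the message.
// Assumption: 128 MiB block size; the message does not state the actual value.
class BlockCountSketch {

    // Hypothetical helper: number of blocks needed to hold fileLength bytes.
    static long blocksFor(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and blockSize > 0");
        }
        // Ceiling division: a partially filled final block still counts as a block.
        return (fileLength + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;   // assumed block size in bytes
        long largest = 73 * blockSize;         // file that exactly fills 73 blocks
        long smallest = 72 * blockSize + 1;    // one byte spills into a 73rd block

        System.out.println(blocksFor(largest, blockSize));   // prints 73
        System.out.println(blocksFor(smallest, blockSize));  // prints 73
        System.out.println(largest);                         // prints 9797894144
    }
}
```

If the cluster uses a different block size, only the blockSize constant needs to change.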

Re: custom input format in spark

2015-04-16 Thread Akhil Das
>> Thanks
>> Best Regards
>>
>> On Thu, Apr 16, 2015 at 4:18 PM, Shushant Arora <shushantaror...@gmail.com> wrote:
>>> Hi
>>>
>>> How to specify custom input format in spark and control is

Re: custom input format in spark

2015-04-16 Thread Shushant Arora
g-file-as-single-record-in-hadoop#answers-header
>
> Thanks
> Best Regards
>
> On Thu, Apr 16, 2015 at 4:18 PM, Shushant Arora wrote:
>> Hi
>>
>> How to specify custom input format in spark and control isSplitable in
>> between file.

Re: custom input format in spark

2015-04-16 Thread Akhil Das
18 PM, Shushant Arora wrote:
> Hi
>
> How to specify custom input format in spark and control isSplitable in
> between file.
> Need to read a file from HDFS, file format is custom and requirement is
> file should not be split in between when an executor node gets that partition
> of

custom input format in spark

2015-04-16 Thread Shushant Arora
Hi, How to specify a custom input format in Spark and control isSplitable in between a file? I need to read a file from HDFS; the file format is custom, and the requirement is that the file should not be split in between when an executor node gets that partition of the input dir. Can anyone share a sample in Java? Thanks
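
As background for the question above: Hadoop's FileInputFormat has a protected isSplitable method that a subclass can override to return false, and Spark's JavaSparkContext offers newAPIHadoopFile, which accepts an input format class. The sketch below does not call those libraries. It is a simplified, standalone model (the class and method names are hypothetical) of what the flag changes: a splittable file is cut into pieces of at most one split size, while a non-splittable file always becomes a single split, so one task sees the whole file. It ignores details of the real planner such as configured minimum and maximum split sizes.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of split planning. This is NOT Hadoop's implementation;
// it only illustrates the effect of an isSplitable-style flag.
class SplitPlanSketch {

    // Hypothetical helper: returns the byte length of each planned split.
    static List<Long> planSplits(long fileLength, long splitSize, boolean splitable) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitSize > 0");
        }
        List<Long> splits = new ArrayList<>();
        if (fileLength == 0) {
            return splits;              // nothing to read
        }
        if (!splitable) {
            splits.add(fileLength);     // whole file goes to a single task
            return splits;
        }
        long remaining = fileLength;
        while (remaining > 0) {
            long length = Math.min(splitSize, remaining);
            splits.add(length);         // the last split may be shorter
            remaining -= length;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long fileLength = 300 * mib;    // illustrative file size, not from the thread
        long splitSize = 128 * mib;     // assumed split size

        System.out.println(planSplits(fileLength, splitSize, true).size());   // prints 3
        System.out.println(planSplits(fileLength, splitSize, false).size());  // prints 1
    }
}
```

The trade-off is parallelism: with the flag off, a large file is processed by one task instead of several.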