Created APEXMALHAR-2116 for this functionality. Please give your feedback on the JIRA ticket.
~ Yogi

On 29 April 2016 at 15:58, Sandeep Deshmukh <[email protected]> wrote:

> +1
>
> Will this support reading a single file in parallel?
> On 29-Apr-2016 3:27 pm, "Mohit Jotwani" <[email protected]> wrote:
>
> > +1
> >
> > Regards,
> > Mohit
> >
> > On Thu, Apr 28, 2016 at 4:29 PM, Yogi Devendra <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > My usecase involves reading from HDFS and emit each record as a separate
> > > tuple. Record can be either fixed length record or separator based record
> > > (such as newline). Expected output is byte[] for each record.
> > >
> > > I am planning to solve this as follows:
> > > - New operator which extends BlockReader.
> > > - It will have configuration option to select mode for FIXED_LENGTH,
> > >   SEPARATOR_BASED.
> > > - Use appropriate ReaderContext based on mode.
> > >
> > > Reason for having different operator than BlockReader is because output
> > > port signature is different than BlockReader. This new operator can be used
> > > in conjunction with FileSplitter.
> > >
> > > Any feedback?
> > >
> > > ~ Yogi
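
For reference, the two record modes proposed in the thread could be sketched roughly as below. This is only an illustration of the splitting logic on an in-memory byte array; it does not use BlockReader, ReaderContext, or any other Apex Malhar API. The class name `RecordSplitSketch` and its methods are hypothetical, and the sketch assumes a single-byte separator and ignores records that span block boundaries, which a real operator would have to handle.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Apex Malhar.
public class RecordSplitSketch {

    // Mode names taken from the proposal in the thread.
    enum Mode { FIXED_LENGTH, SEPARATOR_BASED }

    // Cut the data into chunks of recordLength bytes; the last chunk may be shorter.
    static List<byte[]> fixedLength(byte[] data, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        List<byte[]> records = new ArrayList<>();
        for (int start = 0; start < data.length; start += recordLength) {
            int end = Math.min(start + recordLength, data.length);
            records.add(Arrays.copyOfRange(data, start, end));
        }
        return records;
    }

    // Cut the data at each separator byte; the separator itself is not emitted.
    static List<byte[]> separatorBased(byte[] data, byte separator) {
        List<byte[]> records = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == separator) {
                records.add(Arrays.copyOfRange(data, start, i));
                start = i + 1;
            }
        }
        // Emit trailing bytes that are not terminated by a separator.
        if (start < data.length) {
            records.add(Arrays.copyOfRange(data, start, data.length));
        }
        return records;
    }

    // Dispatch on the configured mode, mirroring the proposed configuration option.
    static List<byte[]> split(Mode mode, byte[] data, int recordLength, byte separator) {
        switch (mode) {
            case FIXED_LENGTH:
                return fixedLength(data, recordLength);
            case SEPARATOR_BASED:
                return separatorBased(data, separator);
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        byte[] fixed = "abcdef".getBytes(StandardCharsets.US_ASCII);
        byte[] lines = "a\nbb\nccc".getBytes(StandardCharsets.US_ASCII);
        for (byte[] r : split(Mode.FIXED_LENGTH, fixed, 3, (byte) '\n')) {
            System.out.println(new String(r, StandardCharsets.US_ASCII));
        }
        for (byte[] r : split(Mode.SEPARATOR_BASED, lines, 0, (byte) '\n')) {
            System.out.println(new String(r, StandardCharsets.US_ASCII));
        }
    }
}
```

In both modes each record comes out as a separate byte[], matching the expected output described in the proposal. How partial records at block edges are stitched together is left to the chosen ReaderContext and is discussed on the JIRA ticket rather than here.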
