Hello, gentlemen!

I would like to implement a custom data provider which will create records
to start map jobs with. For example, I would like to create a thread which
will extract some data from storage (e.g. a relational database) and, for
each record, start a new job which will take that single record and run
map/reduce processing on it. Each such record will produce a lot of
results, which will be processed by the reduce task later.
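The driver thread I have in mind would look roughly like the sketch below. The record source and the launchJob() hook are hypothetical stand-ins: in a real setup the records would come from a JDBC cursor and launchJob() would configure a JobConf and submit it (e.g. via JobClient.runJob).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the driver thread: pull records from storage and start one
// map/reduce job per record.  Both the record source and launchJob() are
// hypothetical placeholders -- in practice the rows would come from JDBC
// and the job would be submitted with JobClient.runJob(conf).
public class JobDriver implements Runnable {
    private final Iterable<String> recordSource;            // stands in for a DB cursor
    private final List<String> launched = new ArrayList<>(); // records handed to jobs

    public JobDriver(Iterable<String> recordSource) {
        this.recordSource = recordSource;
    }

    public void run() {
        for (String record : recordSource) {
            launchJob(record);  // hypothetical: build a JobConf around the record, submit
        }
    }

    // Placeholder for "configure a job around this record and submit it".
    protected void launchJob(String record) {
        launched.add(record);
    }

    public List<String> launchedJobs() {
        return launched;
    }

    public static void main(String[] args) throws InterruptedException {
        JobDriver driver = new JobDriver(Arrays.asList("row-1", "row-2", "row-3"));
        Thread t = new Thread(driver);
        t.start();
        t.join();
        System.out.println(driver.launchedJobs()); // prints [row-1, row-2, row-3]
    }
}
```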

The question is: how do I implement the required interfaces? As far as I
have learned, I would need to implement the InputSplit, RecordReader and
InputFormat interfaces. However, after looking at the sources and javadocs,
it seems all operations are file-based, and the file may be split among
several hosts, which isn't my case. I would be dealing with a single stream
that I need to parse in order to start a job.
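From what I can tell, nothing in the contract forces a file: getSplits() may return a single logical split with no host locations, and the RecordReader can pull records out of any stream. The self-contained sketch below mirrors the shape of the org.apache.hadoop.mapred interfaces with simplified stand-ins (the real InputSplit also extends Writable, and the real methods take JobConf/Reporter arguments, which are omitted here so the sketch compiles on its own):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Simplified stand-ins for org.apache.hadoop.mapred.{InputSplit,
// RecordReader, InputFormat}.  Only the shape of the contract matters here.
interface SplitLike {
    long getLength();          // size hint for the scheduler; may be a guess
    String[] getLocations();   // empty array => no host affinity (no HDFS blocks)
}

interface RecordReaderLike<K, V> {
    K createKey();
    V createValue();
    boolean next(K key, V value) throws IOException; // fill key/value; false at EOF
    void close() throws IOException;
}

interface InputFormatLike<K, V> {
    SplitLike[] getSplits(int hint) throws IOException;
    RecordReaderLike<K, V> getRecordReader(SplitLike split) throws IOException;
}

// Mutable key holder, standing in for LongWritable.
class LongKey { long value; }

// One logical split over a single stream: nothing file-based about it.
public class StreamInputFormat implements InputFormatLike<LongKey, StringBuilder> {
    private final BufferedReader stream;   // e.g. wraps a DB cursor or socket

    public StreamInputFormat(BufferedReader stream) { this.stream = stream; }

    @Override
    public SplitLike[] getSplits(int hint) {
        // The stream cannot be divided among hosts, so return exactly one split.
        return new SplitLike[] { new SplitLike() {
            public long getLength() { return 1; }              // unknown; a guess
            public String[] getLocations() { return new String[0]; }
        }};
    }

    @Override
    public RecordReaderLike<LongKey, StringBuilder> getRecordReader(SplitLike split) {
        return new RecordReaderLike<LongKey, StringBuilder>() {
            private long pos = 0;
            public LongKey createKey() { return new LongKey(); }
            public StringBuilder createValue() { return new StringBuilder(); }
            public boolean next(LongKey key, StringBuilder value) throws IOException {
                String line = stream.readLine();   // "parse": one record per line
                if (line == null) return false;    // end of stream
                key.value = pos++;
                value.setLength(0);
                value.append(line);
                return true;
            }
            public void close() throws IOException { stream.close(); }
        };
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new StringReader("rec-a\nrec-b\n"));
        StreamInputFormat fmt = new StreamInputFormat(in);
        RecordReaderLike<LongKey, StringBuilder> rr =
            fmt.getRecordReader(fmt.getSplits(1)[0]);
        LongKey k = rr.createKey();
        StringBuilder v = rr.createValue();
        while (rr.next(k, v)) {
            System.out.println(k.value + "\t" + v);  // 0  rec-a / 1  rec-b
        }
        rr.close();
    }
}
```

Would implementing the real interfaces along these lines (single split, stream-backed reader) be the right approach?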

Thank you in advance!

-- 
Eugene N Dzhurinsky
