Hello, gentlemen! I would like to implement a custom data provider that creates records to start map jobs with. For example, I would like to create a thread that extracts data from a storage (e.g. a relational database) and starts a new job, which takes a single record and runs map/reduce processing on it. Each such record will produce many results, which will later be processed by the reduce task.
The question is: how do I implement such interfaces? As far as I can tell, I would need to implement the InputSplit, RecordReader and InputFormat interfaces. However, after looking at the sources and javadocs, all the operations seem to be file-based, with the file possibly split across several hosts, which isn't my case: I am dealing with a single stream that I need to parse and then start a job for. Thank you in advance! -- Eugene N Dzhurinsky
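For what it's worth, here is a minimal sketch of how a single-stream source could map onto those contracts, assuming the classic `org.apache.hadoop.mapred` API: the InputSplit can be a trivial "whole stream" marker (`getLength()` returning 0, `getLocations()` an empty array, since there is no block locality for a database stream), `InputFormat.getSplits()` can return just that one split, and the RecordReader pulls one record at a time from the open stream. The class and field names below are hypothetical; the Hadoop types are replaced with plain Java so the core loop runs standalone:

```java
import java.util.Arrays;
import java.util.Iterator;

// Stand-in for the body of a custom RecordReader over a non-file source.
// In a real implementation this logic would live inside a class implementing
// org.apache.hadoop.mapred.RecordReader<LongWritable, Text>, with the
// iterator replaced by an open database ResultSet or network stream.
class RowStreamReader {
    private final Iterator<String> rows;  // stands in for the open stream
    private long pos = 0;                 // records consumed so far

    RowStreamReader(Iterator<String> rows) {
        this.rows = rows;
    }

    // Mirrors RecordReader.next(key, value): fill in the next key/value pair,
    // return true while records remain, false at end of stream.
    boolean next(long[] key, StringBuilder value) {
        if (!rows.hasNext()) {
            return false;
        }
        key[0] = pos++;
        value.setLength(0);
        value.append(rows.next());
        return true;
    }

    // Mirrors RecordReader.getPos(): position in records, not bytes,
    // since a stream has no meaningful byte offset for progress.
    long getPos() {
        return pos;
    }
}
```

The point of the sketch is that nothing in the RecordReader contract actually requires a file: the key can simply be a running record number and progress can be estimated (or reported as unknown) rather than derived from byte offsets.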
