That probably means your problem is pretty easy. Just embed a standard rules engine in a mapper. You can also build a user-defined function (UDF) in Pig or Hive, and Hadoop will handle the parallelism for you.
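A minimal sketch of the mapper approach, assuming plain-text input records and a hypothetical Rule abstraction plus a RuleLoader helper (names are illustrative, not from any particular rules library):

import java.io.IOException;
import java.util.List;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical rule abstraction: each rule decides whether it matches a record.
interface Rule {
  boolean matches(String record);
  String name();
}

public class RuleMapper extends Mapper<LongWritable, Text, Text, Text> {

  private List<Rule> rules;

  @Override
  protected void setup(Context context) {
    // Load the small (< 1000) rule set once per mapper, e.g. from the
    // distributed cache or job configuration; RuleLoader is a placeholder.
    rules = RuleLoader.loadRules(context.getConfiguration());
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String record = value.toString();
    // Apply every rule to every input record; Hadoop parallelizes over input splits,
    // so the huge input side scales out while the rule set stays in memory.
    for (Rule rule : rules) {
      if (rule.matches(record)) {
        context.write(new Text(rule.name()), value);
      }
    }
  }
}

A Hive or Pig UDF would follow the same shape: load the rules once in the UDF's initialization, then evaluate them per row and let the engine handle the distribution.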
On Sat, Oct 20, 2012 at 6:48 AM, Luangsay Sourygna <[email protected]> wrote:

> My problem would be similar to the first option you describe:
> I have a small number of rules (say, < 1000) and a huge number of
> inputs (the big data part).
