I have a Pig job that looks something like this:

    data = LOAD 'database.table' USING org.apache.hive.hcatalog.pig.HCatLoader();
    SPLIT data INTO A IF type == 'A', B OTHERWISE;
    A = DISTINCT A PARALLEL 200;
    B = DISTINCT B PARALLEL 10;
    STORE A INTO 'database.output_table' USING org.apache.hive.hcatalog.pig.HCatStorer('type=A');
    STORE B INTO 'database.output_table' USING org.apache.hive.hcatalog.pig.HCatStorer('type=B');

However, it runs as a single MapReduce job, and both the A and B partitions end up with 200 files. I want partition A to have 200 files and partition B to have 10. How can I fix this?