Hi everyone, happy holidays!

I have a Pig script that reads from 4 different folders in Amazon S3. This
is the code:

load_1 = LOAD 's3n://mybucket/{folder_1,folder_2,folder_3,folder_4}'
USING...;

Instead of reading each folder just once and appending the files,
Pig/Hadoop reads each folder 4 times.

The input should have 62174 records, but in the end I get 248696, which is
exactly 4 times as many.

Why is that? Any ideas?

Thanks,
Rodrigo.
