I had the same problem. You can search the mailing list to find out more about it. In a nutshell, this happens only when Pig calculates the number of reducers it needs by itself. It should go away if you specify the number of reducers explicitly in the join step. Try it and tell us if that works.
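A minimal sketch of what I mean, using the aliases from the quoted script. It is untested against your data, and the reducer count of 1 is only an assumption for illustration; pick whatever fits your cluster. The PARALLEL clause sets the number of reducers for that one operator, so Pig no longer estimates it:

```pig
-- Same CROSS as in the quoted script, but with an explicit reducer count.
-- The value 1 is an illustrative assumption, not a recommendation.
D = CROSS C, sequence_number PARALLEL 1;

-- Alternative: set a script-wide default at the top of the script instead.
-- SET default_parallel 1;
```

If the record count is stable across several runs with this change, that would point to the reducer estimation as the cause.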
________________________________
From: Simonffy Szilvia <[email protected]>
To: [email protected]
Sent: Thursday, August 1, 2013 11:31 PM
Subject: Fwd: Problem with using CROSS in PIG

Hi,

I wrote a Pig script, and I get inconsistent results when I run the same script several times.

pig version: 0.11.1
hadoop version: 1.1.2 / 4 nodes

pig script:

A = LOAD '/tmp/data' AS (request_datetime: chararray, portal_name: chararray, sku: chararray, product_name: chararray, duration: int);
B = FILTER A BY portal_name == 'portal1';
C = FILTER B BY sku == '4505865';
sequence_numbers = LOAD 'sequence_numbers' USING org.apache.hcatalog.pig.HCatLoader();
sequence_number = FILTER sequence_numbers BY key == '20071224_20071230';
sequence_number = FOREACH sequence_number GENERATE seq AS seq;
sequence_number = LIMIT sequence_number 1;
D = CROSS C, sequence_number;
E = FOREACH D GENERATE request_datetime AS request_datetime, portal_name AS portal_name, sku AS sku, product_name AS product_name, duration AS duration, seq AS seq;
STORE E INTO '/tmp/data/output/' USING PigStorage();

Execution results of five runs:

1. Successfully stored 3 records
2. Successfully stored 5 records
3. Successfully stored 2 records
4. Successfully stored 3 records
5. Successfully stored 1 records

Can anybody tell me what is wrong?

ps.: As a workaround I skipped CROSS and used a JOIN instead:

D = JOIN C BY identifier, report_sequence_number BY identifier; -- where identifier is a constant number: 1

With this change the result is correct every time.

data: /tmp/data/data.tsv

2013-03-14T10:07:14 portal1 4505865 Julsång (Cantique de Noël) (1997 Digital Remaster) 304
2013-03-14T22:55:49 portal1 4505865 Julsång (Cantique de Noël) (1997 Digital Remaster) 304
2013-03-19T09:11:03 portal1 4505865 Julsång (Cantique de Noël) (1997 Digital Remaster) 304
2013-03-19T09:23:49 portal1 4505865 Julsång (Cantique de Noël) (1997 Digital Remaster) 304
2013-03-19T09:23:49 portal1 4505865 Julsång (Cantique de Noël) (1997 Digital Remaster) 304
2013-03-17T13:36:15 portal1 4505865 Julsång (Cantique de Noël) (1997 Digital Remaster) 304
2013-03-01T09:07:34 portal1 310451 Heroes (Single Version) 215
2013-03-16T16:13:17 portal1 310451 Heroes (Single Version) 215
2013-03-18T23:19:17 portal1 310451 Heroes (Single Version) 215
2013-03-15T07:47:37 portal1 310451 Heroes (Single Version) 215
2013-03-19T13:48:03 portal1 310451 Heroes (Single Version) 215
2013-03-13T15:17:29 portal1 310451 Heroes (Single Version) 215
2013-03-14T14:34:40 portal1 310451 Heroes (Single Version) 215

data: /tmp/sequence_numbers/data.tsv

20071224_20071230 100
20071231_20080106 101
20080107_20080113 102
20080114_20080120 103
20080121_20080127 104
20080128_20080203 105
20080204_20080210 106
20080211_20080217 107
20080218_20080224 108
20080225_20080302 109

br,
Szilvi
