The following page has been changed by ShravanNarayanamurthy:
http://wiki.apache.org/pig/PigFRJoin

------------------------------------------------------------------------------
  || '''Experiment 1: Reducing Join (1.2, 2.1.*.2, 3.1, 4.1, 5.2)''' || '''Experiment 2: Expanding Join (1.2, 2.2.*.1, 3.1, 4.1, 5.2)''' ||
  || attachment:GrpTimes-off.png || attachment:exp-grp-times.png ||
  || '''Experiment 3: Utilization (1.2, 2.1.*.2, 3.1, 4.1, 5.2)''' || '''Experiment 4: Sorted Bag (1.2, 2.1.*.2, 3.1, 4.1, 5.2)''' ||
- || We measure how well each algorithm utilizes the cluster by running 10 homogeneous jobs simultaneously and calculating how many jobs per minute a cluster of the same size can run with each algorithm. The following graphs show the results of these experiments. || '''TBD''' ||
+ || We measure how well each algorithm utilizes the cluster by running 10 homogeneous jobs simultaneously and calculating how many jobs per minute a cluster of the same size can run with each algorithm. The following graphs show the results of these experiments. || One serious limitation of FRJ is that it tries to read the replicated table into memory. If the file is even slightly larger than the available memory, the join dies with an out-of-memory exception. To work around this problem, we can read the replicated tables, and also the fragment of the fragmented table, into Sorted Bags, which are disk-backed structures, and perform a merge join. However, the graphs below suggest that this is not a viable alternative. ||
  || attachment:util.png || attachment:bagjoin.png ||
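
The two join strategies compared in Experiment 4 can be illustrated with a minimal Python sketch. This is not Pig's implementation: the function names `replicated_hash_join` and `sorted_merge_join` are hypothetical, rows are assumed to be tuples whose first field is the join key, and an in-memory `sorted()` call stands in for loading rows into a disk-backed Sorted Bag.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def replicated_hash_join(fragment, replicated, key=itemgetter(0)):
    """Join one fragment against a replicated table held fully in memory."""
    # Build an in-memory index of the whole replicated table. This is the
    # step that runs out of memory when the table is too large.
    index = defaultdict(list)
    for row in replicated:
        index[key(row)].append(row)
    # Stream the fragment and probe the index for matching rows.
    return [left + right
            for left in fragment
            for right in index.get(key(left), [])]


def _next_group(groups):
    """Return (key, rows) for the next key group, or None when exhausted."""
    for k, rows in groups:
        return k, list(rows)
    return None


def sorted_merge_join(fragment, replicated, key=itemgetter(0)):
    """Join by sorting both inputs on the key and merging the sorted runs."""
    # Assumption: sorted() stands in for a disk-backed sorted bag, so only
    # one key group per side needs to be held in memory during the merge.
    left_groups = groupby(sorted(fragment, key=key), key=key)
    right_groups = groupby(sorted(replicated, key=key), key=key)
    out = []
    left = _next_group(left_groups)
    right = _next_group(right_groups)
    while left is not None and right is not None:
        if left[0] < right[0]:
            left = _next_group(left_groups)
        elif left[0] > right[0]:
            right = _next_group(right_groups)
        else:
            # Keys match: emit the cross product of the two key groups.
            out.extend(l + r for l in left[1] for r in right[1])
            left = _next_group(left_groups)
            right = _next_group(right_groups)
    return out


fragment = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
replicated = [(2, "x"), (1, "y"), (3, "z")]
print(replicated_hash_join(fragment, replicated))
print(sorted_merge_join(fragment, replicated))
```

Both functions return the same joined rows for this input. The hash variant needs the entire replicated table in memory, while the merge variant pays for sorting both inputs first, which is one plausible reason for the slowdown the Sorted Bag graph shows.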
