Dear all,
I want to compare the efficiency of different multi-way join orders. However,
since Pig executes queries lazily, I need to use DUMP or STORE to trigger
execution.
Currently, I use the following query to test the efficiency.
Bad_OrderIn = JOIN inventory BY inv_item_sk, catalog_sales
Thanks for your quick reply. If so, I can use the LIMIT operator to compare
the good and bad join plans, since dumping the full result takes a long time.
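
A minimal sketch of what I have in mind. The relation names, input paths,
delimiter, and schemas here are placeholders for illustration, not my real
TPC-DS script:

```pig
-- Hypothetical inputs: paths, delimiter, and schemas are assumptions.
a = LOAD 'input/a' USING PigStorage('|') AS (k:long, v:chararray);
b = LOAD 'input/b' USING PigStorage('|') AS (k:long, w:chararray);

-- Nothing runs yet: Pig only builds the logical plan at this point.
joined = JOIN a BY k, b BY k;

-- Keep only a few rows so that printing the result stays cheap.
joined_head = LIMIT joined 100;

-- DUMP (or STORE) is what actually triggers execution of the plan.
DUMP joined_head;
```

If the join time really does not depend on the LIMIT value, as the quoted
reply below suggests, the run time of this script should still reflect the
join order.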
Best,
Mingda
On Tue, Dec 6, 2016 at 5:23 PM, Zhang, Liyun wrote:
> Hi:
> I think the query time of the multiple-join part is not related to the
> LIMIT value
Hi,
I am running a multi-way join with a bad join order on 100 GB of TPC-DS data
on our cluster. Each time, the log file contains the exception below.
Has anyone run into this? Could it be caused by the data exceeding the
available disk space?
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/tmp