subject:"Joining DataFrames \- Causing Cartesian Product"

Re: Joining DataFrames - Causing Cartesian Product

2015-12-18 Thread Michael Armbrust

", "USER_DIM_USER_ID") > .withColumnRenamed("USER_CNTRY_ID","USER_DIM_COUNTRY_ID") > .as("userdim") > , userAndRetailDates("USER_ID") <=> $"userdim.USER_DIM_USER_ID" > && userAndRetailDates("US

Re: Joining DataFrames - Causing Cartesian Product

2015-12-18 Thread Prasad Ravilla

R_ID") .withColumnRenamed("USER_CNTRY_ID","USER_DIM_COUNTRY_ID") .as("userdim") , userAndRetailDates("USER_ID") <=> $"userdim.USER_DIM_USER_ID" && userAndRetailDates("USER_CNTRY_ID") <=> $"us

Re: Joining DataFrames - Causing Cartesian Product

2015-12-18 Thread Ted Yu

Can you try the latest 1.6.0 RC which includes SPARK-1 ?

Cheers

On Fri, Dec 18, 2015 at 7:38 AM, Prasad Ravilla wrote:
> Hi,
>
> I am running into performance issue when joining data frames created from
> avro files using spark-avro library.
>
> The data frames are created from 120K avro f

Joining DataFrames - Causing Cartesian Product

2015-12-18 Thread Prasad Ravilla

Hi, I am running into a performance issue when joining data frames created from Avro files using the spark-avro library. The data frames are created from 120K Avro files, and the total size is around 1.5 TB. The two data frames are very large, with billions of records. The join for these two DataFrames

4 matches
