RE: dataframe slow down with tungsten turn on

Cheng, Hao Wed, 04 Nov 2015 20:56:02 -0800

Spark 1.5 has critical performance issues and bugs; you had better try 1.5.1 or
the 1.5.2 RC.

From: gen tang [mailto:[email protected]]
Sent: Thursday, November 5, 2015 12:43 PM
To: [email protected]
Subject: Fwd: dataframe slow down with tungsten turn on

Hi,

In fact, I tested the same code on Spark 1.5 with Tungsten turned off. The
result is about the same as with Tungsten turned on.
So the problem does not seem to be Tungsten; Spark 1.5 is simply slower than
Spark 1.4.
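
A comparison like this can be sketched as below. This is only an illustration, not the code from the job: `run_etl`, `time_with_tungsten`, and the table/path arguments are hypothetical names, and the sketch assumes a Spark 1.5 `SQLContext`/`HiveContext` where the `spark.sql.tungsten.enabled` setting can be changed with `setConf`.

```python
import time

# Spark 1.5 SQL setting that switches Tungsten on or off.
TUNGSTEN_KEY = "spark.sql.tungsten.enabled"


def run_etl(sql_context, table_name, output_path):
    """Hypothetical stand-in for the job: Hive table -> operations -> JSON."""
    df = sql_context.table(table_name)
    # ... the job's own DataFrame operations would go here ...
    df.write.mode("overwrite").json(output_path)


def time_with_tungsten(sql_context, job, flags=("true", "false")):
    """Run job(sql_context) once per flag value; return elapsed seconds per flag."""
    timings = {}
    for flag in flags:
        sql_context.setConf(TUNGSTEN_KEY, flag)
        start = time.perf_counter()
        job(sql_context)
        timings[flag] = time.perf_counter() - start
    return timings
```

With a real context, a call might look like `time_with_tungsten(sqlContext, lambda ctx: run_etl(ctx, "some_table", "/tmp/out"))`, where the table name and output path are placeholders. If both timings come out similar, the slowdown is unlikely to be caused by Tungsten itself.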

Any idea why this happens?
Thanks a lot in advance.

Cheers
Gen

---------- Forwarded message ----------
From: gen tang <[email protected]>
Date: Wed, Nov 4, 2015 at 3:54 PM
Subject: dataframe slow down with tungsten turn on
To: [email protected]

Hi sparkers,

I am using DataFrames to do some large ETL jobs.
More precisely, I create a DataFrame from a Hive table, apply some operations,
and then save the result as JSON.

When I used spark-1.4.1, the whole process was quite fast, about 1 minute.
However, when I run the same code with spark-1.5.1 (with Tungsten turned on), it
takes about 2 hours to finish the same job.

I checked the task details; almost all of the time is spent on computation.
Any idea why this happens?

Thanks a lot in advance for your help.

Cheers
Gen
