Dear all,

I have some JSON documents that need to be converted to Parquet files.
Since version 1.13 I have had a very serious performance problem with this write: a sample
JSON document of 100,000 records takes 5 minutes to finish.
I've included the operator profile below.
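For reference, the conversion is a plain CTAS with the Parquet output format, along these lines (the workspace and file names here are only placeholders, not the actual ones):

```sql
-- Ensure CTAS writes Parquet (this is Drill's default output format)
ALTER SESSION SET `store.format` = 'parquet';

-- Convert the JSON sample to a Parquet table
CREATE TABLE dfs.tmp.`sample_parquet` AS
SELECT * FROM dfs.tmp.`sample_100k.json`;
```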


Overview <http://10.233.50.111:8047/profiles/23329968-0346-e658-b1c9-92fd7dc60d2a#operator-overview>

Operator ID | Type              | Avg Setup Time | Max Setup Time | Avg Process Time | Max Process Time | Min Wait Time | Avg Wait Time | Max Wait Time | % Fragment Time | % Query Time | Rows    | Avg Peak Memory | Max Peak Memory
------------|-------------------|----------------|----------------|------------------|------------------|---------------|---------------|---------------|-----------------|--------------|---------|-----------------|----------------
00-xx-00    | SCREEN            | 0.000s         | 0.000s         | 1.262s           | 2.510s           | 0.004s        | 0.075s        | 0.145s        | 0.82%           | 0.82%        | 111,491 | 10MB            | 20MB
00-xx-01    | PROJECT           | 0.002s         | 0.002s         | 0.001s           | 0.001s           | 0.000s        | 0.000s        | 0.000s        | 0.00%           | 0.00%        | 1       | -               | -
00-xx-02    | PARQUET_WRITER    | 0.293s         | 0.293s         | 50.750s          | 50.750s          | 0.000s        | 0.000s        | 0.000s        | 16.44%          | 16.44%       | 111,490 | -               | -
00-xx-03    | PROJECT_ALLOW_DUP | 0.032s         | 0.032s         | 2m0s             | 2m0s             | 0.000s        | 0.000s        | 0.000s        | 39.01%          | 39.01%       | 111,490 | 13MB            | 13MB
00-xx-04    | PROJECT           | 16.092s        | 16.092s        | 2m15s            | 2m15s            | 0.000s        | 0.000s        | 0.000s        | 43.73%          | 43.73%       | 111,490 | 13MB            | 13MB



I do not know what the 'PROJECT_ALLOW_DUP' and 'PROJECT' operators do, but they account for
most of the query time. Could you tell me what has changed in this code path from 1.13 up to now?
I see the same problem in the 1.16 release.



Best Regards,

Mehran Dashti
Product Leader
09125902452
