Dear all,

I have some JSON documents to convert to Parquet files. Since version 1.13 I have had a very serious performance issue with this write path: a sample JSON document of 100,000 records takes 5 minutes to finish. I've included the query plan below.
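For context, the conversion is the usual JSON-to-Parquet job in Drill. A minimal sketch of the kind of statement that produces a plan like the one below (the workspace names and file paths here are placeholders, not my actual paths):

```
ALTER SESSION SET `store.format` = 'parquet';

CREATE TABLE dfs.tmp.`sample_parquet` AS
SELECT * FROM dfs.`/data/sample_100k.json`;
```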
Overview <http://10.233.50.111:8047/profiles/23329968-0346-e658-b1c9-92fd7dc60d2a#operator-overview>

Operator ID | Type              | Avg Setup | Max Setup | Avg Process | Max Process | Min Wait | Avg Wait | Max Wait | % Fragment | % Query | Rows    | Avg Peak Mem | Max Peak Mem
00-xx-00    | SCREEN            | 0.000s    | 0.000s    | 1.262s      | 2.510s      | 0.004s   | 0.075s   | 0.145s   | 0.82%      | 0.82%   | 111,491 | 10MB         | 20MB
00-xx-01    | PROJECT           | 0.002s    | 0.002s    | 0.001s      | 0.001s      | 0.000s   | 0.000s   | 0.000s   | 0.00%      | 0.00%   | 1       | -            | -
00-xx-02    | PARQUET_WRITER    | 0.293s    | 0.293s    | 50.750s     | 50.750s     | 0.000s   | 0.000s   | 0.000s   | 16.44%     | 16.44%  | 111,490 | -            | -
00-xx-03    | PROJECT_ALLOW_DUP | 0.032s    | 0.032s    | 2m0s        | 2m0s        | 0.000s   | 0.000s   | 0.000s   | 39.01%     | 39.01%  | 111,490 | 13MB         | 13MB
00-xx-04    | PROJECT           | 16.092s   | 16.092s   | 2m15s       | 2m15s       | 0.000s   | 0.000s   | 0.000s   | 43.73%     | 43.73%  | 111,490 | 13MB         | 13MB

I do not know what the 'PROJECT_ALLOW_DUP' and 'PROJECT' operators do. Could you please tell me what has changed from 1.13 up to now? I see a similar problem in the 1.16 release.

Best Regards,

Mehran Dashti
Product Leader
09125902452