jerqi commented on PR #294:
URL:
https://github.com/apache/incubator-uniffle/pull/294#issuecomment-1344157680
> # Performance Test
> ## Table
> Table1: 10g, dtypes: Array[(String, String)] = Array((v1,StringType),
> (k1,StringType)). All rows have the same k1 value (k1 = 10).
>
> Table2: 10 records, dtypes: Array[(String, String)] =
> Array((k2,StringType), (v2,StringType)). Exactly one record has k2 = 10.
>
> ## Env
> Spark Resource Profile: 10 executors (1 core, 4g each)
>
> Shuffle-server Environment: 6 shuffle servers, 20g for buffer read and 40g for buffer write
>
> Spark Shuffle Client Config: storage type MEMORY_LOCALFILE_HDFS with LOCAL_ORDER
>
> SQL: `spark.sql("select * from Table1,Table2 where k1 = k2").write.mode("overwrite").parquet("xxxxxx")`
>
> ## Result
> `BITMAP` and `MINMAX` perform similarly. I think the gap between them has
> little impact on overall performance. See the following picture.

>
> cc @jerqi @zuston
OK.
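
For reference, the join in the quoted test can be modeled at small scale in plain Scala, without Spark. This is only a sketch: the `JoinModel` object, the row counts, and the v1/v2 values are illustrative assumptions, not part of the benchmark. It shows that because every Table1 row has k1 = 10 and exactly one Table2 record has k2 = 10, each Table1 row matches exactly one Table2 row.

```scala
// Scaled-down, plain-Scala model of the benchmark join (no Spark required).
// Row counts and the v1/v2 values are illustrative assumptions.
object JoinModel {
  // Table1 (v1, k1): every row carries the same key, k1 = "10".
  def table1(rows: Int): Seq[(String, String)] =
    (1 to rows).map(i => (s"v1_$i", "10"))

  // Table2 (k2, v2): 10 records, exactly one of which has k2 = "10".
  val table2: Seq[(String, String)] =
    (1 to 10).map(i => (i.toString, s"v2_$i"))

  // Collection-level equivalent of:
  //   select * from Table1, Table2 where k1 = k2
  def join(
      t1: Seq[(String, String)],
      t2: Seq[(String, String)]
  ): Seq[(String, String, String, String)] =
    for {
      (v1, k1) <- t1
      (k2, v2) <- t2
      if k1 == k2
    } yield (v1, k1, k2, v2)
}
```

Calling `JoinModel.join(JoinModel.table1(5), JoinModel.table2)` yields 5 rows, one per Table1 row, all with the join key 10.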
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]