[ https://issues.apache.org/jira/browse/SPARK-50788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gengliang Wang reassigned SPARK-50788:
--------------------------------------
Assignee: Yuchuan Huang
> Add Benchmark for Large-Row Dataframe
> -------------------------------------
>
> Key: SPARK-50788
> URL: https://issues.apache.org/jira/browse/SPARK-50788
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Yuchuan Huang
> Assignee: Yuchuan Huang
> Priority: Major
> Labels: pull-request-available
>
> This proposal aims to introduce a new micro-benchmark, “LargeRowBenchmark”, to
> check Spark's support for, and performance on, large-row dataframes. A
> large-row dataframe is one with MB-size cells (for example, online chat
> records). Unlike the existing WideTableBenchmark, where the dataframe has
> more columns than usual, this benchmark focuses on dataframes with a normal
> number of columns but larger-than-usual cells. The purpose of this proposal
> is to add large-row dataframes to future performance regression checks.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)