[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun closed SPARK-47085.
---------------------------------
> Performance issue on thrift API
> -------------------------------
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.1, 3.5.0
> Reporter: Izek Greenfield
> Assignee: Izek Greenfield
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.2, 3.4.3, 4.0.0
>
>
> The _*O(n^2)*_ complexity was introduced in SPARK-39041.
> Before that change, the code was:
> {code:scala}
> while (curRow < maxRows && iter.hasNext) {
>   val sparkRow = iter.next()
>   val row = ArrayBuffer[Any]()
>   var curCol = 0
>   while (curCol < sparkRow.length) {
>     if (sparkRow.isNullAt(curCol)) {
>       row += null
>     } else {
>       addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
>     }
>     curCol += 1
>   }
>   resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
>   curRow += 1
> }
> {code}
> That version walked the rows through an iterator, without the _*O(n^2)*_
> complexity, so this change just returns the code to the state it was in before.
>
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
> while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O( n )*_ complexity.
>
>
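The description does not say why the indexed loop is quadratic. One likely explanation, assuming `rows` is a linked `Seq` such as `List`, is that each `rows(i)` call walks `i` elements from the head, so the loop as a whole does O(n^2) work, while `foreach` visits each element once. The sketch below illustrates the two traversal styles. `TraversalSketch`, `countIndexed`, and `countForeach` are hypothetical names for illustration and are not the actual `RowSetUtils` code.

```scala
// Illustrative sketch only: hypothetical names, not Spark's RowSetUtils.
object TraversalSketch {

  // Indexed traversal. When `rows` is a List, rows(i) is linear in i,
  // so the loop as a whole is O(n^2) in the number of rows.
  def countIndexed(rows: Seq[Seq[Any]]): Vector[Int] = {
    val out = Vector.newBuilder[Int]
    val rowSize = rows.length
    var i = 0
    while (i < rowSize) {
      val row = rows(i)
      out += row.count(_ != null) // number of non-null columns in this row
      i += 1
    }
    out.result()
  }

  // foreach traversal. Each row is visited exactly once, so the loop
  // is O(n) for any Seq implementation.
  def countForeach(rows: Seq[Seq[Any]]): Vector[Int] = {
    val out = Vector.newBuilder[Int]
    rows.foreach { row =>
      out += row.count(_ != null)
    }
    out.result()
  }
}
```

Both functions return the same result. They differ only in how much work they do when the input `Seq` does not support constant-time indexing.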