[
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Izek Greenfield updated SPARK-47085:
------------------------------------
Description:
This new complexity was introduced in SPARK-39041.
Before that change, the code was:
{code:java}
while (curRow < maxRows && iter.hasNext) {
  val sparkRow = iter.next()
  val row = ArrayBuffer[Any]()
  var curCol = 0
  while (curCol < sparkRow.length) {
    if (sparkRow.isNullAt(curCol)) {
      row += null
    } else {
      addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
    }
    curCol += 1
  }
  resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
  curRow += 1
}{code}
This loop walks the rows through an iterator and avoids the _*O(n^2)*_
complexity, so this change just returns the code to its earlier state.
In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.
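The issue does not say why the indexed loop is quadratic. A likely reason, stated here as an assumption, is that {{rows}} is a linked-list-backed {{Seq}}, where {{rows(i)}} has to walk {{i}} elements on every call. The sketch below is not the Spark code; {{IndexedVsIterator}}, {{sumByIndex}} and {{sumByIterator}} are hypothetical names used only to contrast the two access patterns on a plain {{List}}.

{code:scala}
object IndexedVsIterator {
  // Indexed access: on a List, rows(i) walks i nodes, so the loop is O(n^2) overall.
  def sumByIndex(rows: Seq[Int]): Long = {
    var total = 0L
    var i = 0
    val rowSize = rows.length
    while (i < rowSize) {
      total += rows(i)
      i += 1
    }
    total
  }

  // Iterator access: each next() is O(1) on a List, so the loop is O(n) overall.
  def sumByIterator(rows: Seq[Int]): Long = {
    var total = 0L
    val iter = rows.iterator
    while (iter.hasNext) {
      total += iter.next()
    }
    total
  }

  def main(args: Array[String]): Unit = {
    // List is used on purpose: it is the case where indexed access is linear.
    val rows: Seq[Int] = List.range(0, 5000)
    assert(sumByIndex(rows) == sumByIterator(rows))
    println(s"both loops agree: ${sumByIterator(rows)}")
  }
}
{code}

Both functions return the same result; only the cost of reaching each element differs, which is why switching to an iterator should not change behavior.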
was:
This new complexity was introduced in SPARK-39041.
Before that change, the code was:
{code:java}
def toTTableSchema(schema: StructType): TTableSchema = {
  val tTableSchema = new TTableSchema()
  schema.zipWithIndex.foreach { case (f, i) =>
    tTableSchema.addToColumns(toTColumnDesc(f, i))
  }
  tTableSchema
} {code}
That code uses a foreach without the _*O(n^2)*_ complexity, so this change
just returns the code to its earlier state.
In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.
> Performance issue on Thrift API
> -------------------------------
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.1, 3.5.0
> Reporter: Izek Greenfield
> Assignee: Izek Greenfield
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>