[jira] [Updated] (SPARK-47085) Preformance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-47085:
--
    Fix Version/s: 3.4.3

> Preformance issue on thrift API
> ---
>
>                 Key: SPARK-47085
>                 URL: https://issues.apache.org/jira/browse/SPARK-47085
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.1, 3.5.0
>            Reporter: Izek Greenfield
>            Assignee: Izek Greenfield
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0, 3.5.2, 3.4.3
>
>
> This new complexity was introduced in SPARK-39041.
> Before that PR, the code was:
> {code:java}
> while (curRow < maxRows && iter.hasNext) {
>   val sparkRow = iter.next()
>   val row = ArrayBuffer[Any]()
>   var curCol = 0
>   while (curCol < sparkRow.length) {
>     if (sparkRow.isNullAt(curCol)) {
>       row += null
>     } else {
>       addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
>     }
>     curCol += 1
>   }
>   resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
>   curRow += 1
> }{code}
> That code iterated without the _*O(n^2)*_ complexity, so this change just
> returns the state to what it was before.
>
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
> while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O( n )*_ complexity.
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
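
The description above contrasts an indexed `while` loop with an iterator-based one. The following is a minimal sketch of why that matters, assuming `rows` is a `Seq` with linear-time indexing such as `List` (the issue does not state the concrete type). The `Row` alias and the names `copyIndexed` and `copyWithIterator` are hypothetical and are not taken from `RowSetUtils`.

```scala
import scala.collection.mutable.ArrayBuffer

object RowLoopSketch {
  // Hypothetical stand-in for a result row: just a sequence of column values.
  type Row = Seq[Any]

  // Indexed access: when `rows` is a List, rows(i) walks i nodes from the
  // head, so the loop as a whole does O(n^2) work.
  def copyIndexed(rows: Seq[Row]): ArrayBuffer[Row] = {
    val out = ArrayBuffer.empty[Row]
    val rowSize = rows.length
    var i = 0
    while (i < rowSize) {
      val row = rows(i)
      out += row
      i += 1
    }
    out
  }

  // Iterator-based traversal: each element is visited exactly once, so the
  // loop is O(n) whatever the concrete Seq implementation is.
  def copyWithIterator(rows: Seq[Row]): ArrayBuffer[Row] = {
    val out = ArrayBuffer.empty[Row]
    val iter = rows.iterator
    while (iter.hasNext) {
      out += iter.next()
    }
    out
  }
}
```

Both methods produce the same result; only the cost of reaching each element differs when the input does not support constant-time indexing.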
[jira] [Updated] (SPARK-47085) Preformance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-47085:
--
    Fix Version/s: 3.5.2
[jira] [Updated] (SPARK-47085) Preformance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Izek Greenfield updated SPARK-47085:
    Description:
This new complexity was introduced in SPARK-39041.
Before that PR, the code was:
{code:java}
while (curRow < maxRows && iter.hasNext) {
  val sparkRow = iter.next()
  val row = ArrayBuffer[Any]()
  var curCol = 0
  while (curCol < sparkRow.length) {
    if (sparkRow.isNullAt(curCol)) {
      row += null
    } else {
      addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
    }
    curCol += 1
  }
  resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
  curRow += 1
}{code}
That code iterated without the _*O(n^2)*_ complexity, so this change just returns the state to what it was before.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.

  was:
This new complexity was introduced in SPARK-39041.
Before that PR, the code was:
{code:java}
def toTTableSchema(schema: StructType): TTableSchema = {
  val tTableSchema = new TTableSchema()
  schema.zipWithIndex.foreach { case (f, i) =>
    tTableSchema.addToColumns(toTColumnDesc(f, i))
  }
  tTableSchema
} {code}
That code iterated without the _*O(n^2)*_ complexity, so this change just returns the state to what it was before.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.
[jira] [Updated] (SPARK-47085) Preformance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Izek Greenfield updated SPARK-47085:
    Description:
This new complexity was introduced in SPARK-39041.
Before that PR, the code was:
{code:java}
def toTTableSchema(schema: StructType): TTableSchema = {
  val tTableSchema = new TTableSchema()
  schema.zipWithIndex.foreach { case (f, i) =>
    tTableSchema.addToColumns(toTColumnDesc(f, i))
  }
  tTableSchema
} {code}
That code iterated without the _*O(n^2)*_ complexity, so this change just returns the state to what it was before.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.

  was:
This new complexity was introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.
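
The `toTTableSchema` snippet in this update pairs each field with its position through `zipWithIndex` and then visits every pair once. A minimal sketch of that single-pass pattern follows. `Field`, `ColumnDesc`, and `toColumnDescs` are hypothetical stand-ins for illustration, not the Spark or Thrift types named in the issue.

```scala
object SchemaSketch {
  // Hypothetical stand-ins for a schema field and a column descriptor.
  final case class Field(name: String, dataType: String)
  final case class ColumnDesc(name: String, dataType: String, position: Int)

  // zipWithIndex builds the (element, index) pairs in one pass and foreach
  // walks them in a second pass, so no element is looked up by index.
  def toColumnDescs(schema: Seq[Field]): Vector[ColumnDesc] = {
    val columns = Vector.newBuilder[ColumnDesc]
    schema.zipWithIndex.foreach { case (f, i) =>
      columns += ColumnDesc(f.name, f.dataType, i)
    }
    columns.result()
  }
}
```

The index is carried alongside each element, which keeps the traversal linear even when the input is a linked list.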
[jira] [Updated] (SPARK-47085) Preformance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Izek Greenfield updated SPARK-47085:
    Description:
This new complexity was introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.

  was:
This new complexity introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O(n)*_ complexity.
[jira] [Updated] (SPARK-47085) Preformance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Izek Greenfield updated SPARK-47085:
    Description:
This new complexity introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O(n)*_ complexity.

  was:
in class `RowSetUtils` there is a loop that has O(n^2) complexity:
{code:scala}
...
while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted into O( n ) complexity.
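
If an indexed `while` loop has to stay, one alternative way to reach O(n) is to copy the input once into a collection with effectively constant-time indexing. This is only a sketch of that alternative under the assumption that the input may be a linked `List`; it is not claimed to be the fix applied for this issue, and `sumLengths` is a hypothetical example function.

```scala
object IndexedSeqSketch {
  // Sums the lengths of all strings using an indexed loop.
  def sumLengths(rows: Seq[String]): Int = {
    // One O(n) copy into an immutable IndexedSeq, whose apply(i) is
    // effectively constant time, so the loop below is O(n) overall.
    val indexed = rows.toIndexedSeq
    var total = 0
    var i = 0
    while (i < indexed.length) {
      total += indexed(i).length
      i += 1
    }
    total
  }
}
```

The trade-off is an extra copy of the sequence, which an iterator-based traversal avoids.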
[jira] [Updated] (SPARK-47085) Preformance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Izek Greenfield updated SPARK-47085:
    Issue Type: Bug  (was: Improvement)
[jira] [Updated] (SPARK-47085) Preformance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47085:
---
    Labels: pull-request-available  (was: )