[jira] [Updated] (SPARK-47085) Preformance issue on thrift API

2024-02-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47085:
--
Fix Version/s: 3.4.3

> Preformance issue on thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
>
>
> This new complexity was introduced in SPARK-39041.
> Before that change the code was:
> {code:scala}
> while (curRow < maxRows && iter.hasNext) {
>   val sparkRow = iter.next()
>   val row = ArrayBuffer[Any]()
>   var curCol = 0
>   while (curCol < sparkRow.length) {
>     if (sparkRow.isNullAt(curCol)) {
>       row += null
>     } else {
>       addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
>     }
>     curCol += 1
>   }
>   resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
>   curRow += 1
> }{code}
> It iterates without the _*O(n^2)*_ complexity, so this change just returns the
> state to what it was before.
>
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O( n )*_ complexity.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47085) Preformance issue on thrift API

2024-02-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47085:
--
Fix Version/s: 3.5.2

> Preformance issue on thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2
>
>
> This new complexity was introduced in SPARK-39041.
> Before that change the code was:
> {code:scala}
> while (curRow < maxRows && iter.hasNext) {
>   val sparkRow = iter.next()
>   val row = ArrayBuffer[Any]()
>   var curCol = 0
>   while (curCol < sparkRow.length) {
>     if (sparkRow.isNullAt(curCol)) {
>       row += null
>     } else {
>       addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
>     }
>     curCol += 1
>   }
>   resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
>   curRow += 1
> }{code}
> It iterates without the _*O(n^2)*_ complexity, so this change just returns the
> state to what it was before.
>
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O( n )*_ complexity.
>  
>  






[jira] [Updated] (SPARK-47085) Preformance issue on thrift API

2024-02-20 Thread Izek Greenfield (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izek Greenfield updated SPARK-47085:

Description: 
This new complexity was introduced in SPARK-39041.

Before that change the code was:
{code:scala}
while (curRow < maxRows && iter.hasNext) {
  val sparkRow = iter.next()
  val row = ArrayBuffer[Any]()
  var curCol = 0
  while (curCol < sparkRow.length) {
    if (sparkRow.isNullAt(curCol)) {
      row += null
    } else {
      addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
    }
    curCol += 1
  }
  resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
  curRow += 1
}{code}
It iterates without the _*O(n^2)*_ complexity, so this change just returns the state
to what it was before.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.
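
A minimal sketch of why the indexed loop above can be quadratic, assuming `rows` is a `Seq` backed by a linked `List` (the description does not state the collection type). On a `List`, `rows(i)` walks `i` cells from the head on every call, so the whole loop costs O(n^2), while an iterator keeps its position and visits each element once. `RowLoopSketch`, `copyIndexed` and `copyWithIterator` are hypothetical names for illustration, not Spark code.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the row-copying loop; not the Spark implementation.
object RowLoopSketch {

  // Indexed access: on a linked List every rows(i) is O(i), so the loop is O(n^2).
  def copyIndexed[A](rows: Seq[A]): ArrayBuffer[A] = {
    val out = ArrayBuffer.empty[A]
    val rowSize = rows.length
    var i = 0
    while (i < rowSize) {
      out += rows(i)
      i += 1
    }
    out
  }

  // Iterator access: each element is visited exactly once, so the loop is O(n).
  def copyWithIterator[A](rows: Seq[A]): ArrayBuffer[A] = {
    val out = ArrayBuffer.empty[A]
    val iter = rows.iterator
    while (iter.hasNext) {
      out += iter.next()
    }
    out
  }
}
```

Both functions return the same elements in the same order; only the traversal cost differs, and the gap only shows up when the underlying collection lacks constant-time indexing.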

 

 

  was:
This new complexity was introduced in SPARK-39041.

before this PR the code was:


{code:java}
def toTTableSchema(schema: StructType): TTableSchema = {
  val tTableSchema = new TTableSchema()
  schema.zipWithIndex.foreach { case (f, i) =>
    tTableSchema.addToColumns(toTColumnDesc(f, i))
  }
  tTableSchema
} {code}
 foreach without the _*O(n^2)*_ complexity so this change just return the state 
to what it was before.

 

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.

 

 


> Preformance issue on thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This new complexity was introduced in SPARK-39041.
> Before that change the code was:
> {code:scala}
> while (curRow < maxRows && iter.hasNext) {
>   val sparkRow = iter.next()
>   val row = ArrayBuffer[Any]()
>   var curCol = 0
>   while (curCol < sparkRow.length) {
>     if (sparkRow.isNullAt(curCol)) {
>       row += null
>     } else {
>       addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
>     }
>     curCol += 1
>   }
>   resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
>   curRow += 1
> }{code}
> It iterates without the _*O(n^2)*_ complexity, so this change just returns the
> state to what it was before.
>
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O( n )*_ complexity.
>  
>  






[jira] [Updated] (SPARK-47085) Preformance issue on thrift API

2024-02-20 Thread Izek Greenfield (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izek Greenfield updated SPARK-47085:

Description: 
This new complexity was introduced in SPARK-39041.

Before that change the code was:

{code:scala}
def toTTableSchema(schema: StructType): TTableSchema = {
  val tTableSchema = new TTableSchema()
  schema.zipWithIndex.foreach { case (f, i) =>
    tTableSchema.addToColumns(toTColumnDesc(f, i))
  }
  tTableSchema
} {code}
This is a foreach without the _*O(n^2)*_ complexity, so this change just returns the state
to what it was before.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.

 

 

  was:
This new complexity was introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.

 

 


> Preformance issue on thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This new complexity was introduced in SPARK-39041.
> Before that change the code was:
> {code:scala}
> def toTTableSchema(schema: StructType): TTableSchema = {
>   val tTableSchema = new TTableSchema()
>   schema.zipWithIndex.foreach { case (f, i) =>
>     tTableSchema.addToColumns(toTColumnDesc(f, i))
>   }
>   tTableSchema
> } {code}
> This is a foreach without the _*O(n^2)*_ complexity, so this change just returns the
> state to what it was before.
>
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O( n )*_ complexity.
>  
>  






[jira] [Updated] (SPARK-47085) Preformance issue on thrift API

2024-02-19 Thread Izek Greenfield (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izek Greenfield updated SPARK-47085:

Description: 
This new complexity was introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O( n )*_ complexity.

 

 

  was:
This new complexity introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O(n)*_ complexity.

 

 


> Preformance issue on thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This new complexity was introduced in SPARK-39041.
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O( n )*_ complexity.
>  
>  






[jira] [Updated] (SPARK-47085) Preformance issue on thrift API

2024-02-19 Thread Izek Greenfield (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izek Greenfield updated SPARK-47085:

Description: 
This new complexity introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O(n)*_ complexity.

 

 

  was:
in class `RowSetUtils` there is a loop that has O(n^2) complexity:


{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}

It can be easily converted into O( n ) complexity. 


> Preformance issue on thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This new complexity introduced in SPARK-39041.
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O(n)*_ complexity.
>  
>  






[jira] [Updated] (SPARK-47085) Preformance issue on thrift API

2024-02-19 Thread Izek Greenfield (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izek Greenfield updated SPARK-47085:

Issue Type: Bug  (was: Improvement)

> Preformance issue on thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> In class `RowSetUtils` there is a loop that has O(n^2) complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted into O( n ) complexity. 






[jira] [Updated] (SPARK-47085) Preformance issue on thrift API

2024-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47085:
---
Labels: pull-request-available  (was: )

> Preformance issue on thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
>
> In class `RowSetUtils` there is a loop that has O(n^2) complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted into O( n ) complexity. 


