Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


LuciferYang commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415517518


##
core/src/main/scala/org/apache/spark/ui/UIUtils.scala:
##
@@ -431,7 +431,7 @@ private[spark] object UIUtils extends Logging {
 }
 
 val headerRow: Seq[Node] = {
-  headers.view.zipWithIndex.map { x =>
+  headers.zipWithIndex.map { x =>

Review Comment:
   
https://github.com/apache/spark/pull/2867/files#diff-809c93c57cc59e5fe3c3eb54a24aa96a38147d02323f3e690ae6b5309a3284d2
   
   
![image](https://github.com/apache/spark/assets/1475305/d92f444c-4fe7-455c-9416-0785a5d93d27)
   
   It seems the original intention here was also to deliberately use a lazy 
view, so I will also test `LazyList` here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


LuciferYang commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415496286


##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala:
##
@@ -388,7 +388,7 @@ object CompactibleFileStreamLog {
 } else if (defaultInterval < (latestCompactBatchId + 1) / 2) {
   // Find the first divisor >= default compact interval
   def properDivisors(min: Int, n: Int) =
-(min to n/2).view.filter(i => n % i == 0).toSeq :+ n
+(min to n / 2).to(LazyList).filter(i => n % i == 0) :+ n

Review Comment:
   @yaooqinn 
   
https://github.com/apache/spark/pull/15852/files#diff-ed29e1893fa3939724b333684e1f6037358b3d7edce1e3663afdda8d99002fed
   
   
![image](https://github.com/apache/spark/assets/1475305/03e8f4dc-cada-48de-8d8d-206ad1dc6d83)
   
   It seems the original intention here was also to use a lazy view, so I am 
testing `LazyList`.
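
As a rough illustration of what "lazy" buys in this spot, here is a minimal sketch (not the Spark code) that mirrors the `+` line of the diff on Scala 2.13. The object name `LazyDivisorsSketch`, the helper `firstDivisor`, and the local counter `checked` are assumptions added only for the demonstration; the counter records how many candidates the filter predicate inspects before the first divisor is found.

```scala
object LazyDivisorsSketch {
  // Returns (first divisor >= min, number of candidates the predicate inspected).
  // The pipeline has the same shape as the changed line in the diff; only the
  // counter is extra.
  def firstDivisor(min: Int, n: Int): (Int, Int) = {
    var checked = 0
    val divisors: LazyList[Int] =
      (min to n / 2).to(LazyList).filter { i => checked += 1; n % i == 0 } :+ n
    // Nothing has been filtered yet; `head` forces only as much as it needs.
    val first = divisors.head
    (first, checked)
  }

  // Forces the whole list, for comparison with the lazy access above.
  def allDivisors(min: Int, n: Int): List[Int] =
    ((min to n / 2).to(LazyList).filter(i => n % i == 0) :+ n).toList

  def main(args: Array[String]): Unit = {
    val (first, checked) = firstDivisor(3, 100)
    println(s"first divisor: $first, candidates inspected: $checked")
    println(s"all divisors: ${allDivisors(3, 100)}")
  }
}
```

Because only the first divisor is needed by the caller, the lazy version can stop long before it reaches `n / 2`, which appears to be the behavior the original `view` was meant to preserve.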
   
   
   







Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


LuciferYang commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415490110


##
mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala:
##
@@ -187,7 +187,7 @@ class GaussianMixture private (
   case None =>
 val samples = breezeData.takeSample(withReplacement = true, k * nSamples, seed)
 (Array.fill(k)(1.0 / k), Array.tabulate(k) { i =>
-  val slice = samples.view.slice(i * nSamples, (i + 1) * nSamples)
+  val slice = samples.slice(i * nSamples, (i + 1) * nSamples)

Review Comment:
   thanks @zhengruifeng 






Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


zhengruifeng commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415479079


##
mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala:
##
@@ -187,7 +187,7 @@ class GaussianMixture private (
   case None =>
 val samples = breezeData.takeSample(withReplacement = true, k * nSamples, seed)
 (Array.fill(k)(1.0 / k), Array.tabulate(k) { i =>
-  val slice = samples.view.slice(i * nSamples, (i + 1) * nSamples)
+  val slice = samples.slice(i * nSamples, (i + 1) * nSamples)

Review Comment:
   The ML part should be fine if all tests pass.






Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


LuciferYang commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415455947


##
mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala:
##
@@ -187,7 +187,7 @@ class GaussianMixture private (
   case None =>
 val samples = breezeData.takeSample(withReplacement = true, k * nSamples, seed)
 (Array.fill(k)(1.0 / k), Array.tabulate(k) { i =>
-  val slice = samples.view.slice(i * nSamples, (i + 1) * nSamples)
+  val slice = samples.slice(i * nSamples, (i + 1) * nSamples)

Review Comment:
   Also cc @zhengruifeng: could you help review the changes in mllib? Thanks.
   
   






Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


LuciferYang commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415331079


##
sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala:
##
@@ -417,7 +417,7 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[Stru
   override def defaultSize: Int = fields.map(_.dataType.defaultSize).sum
 
   override def simpleString: String = {
-val fieldTypes = fields.view.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq
+val fieldTypes = fields.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq

Review Comment:
   Got it, let me test using `LazyList` here.






Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


LuciferYang commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415266748


##
sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala:
##
@@ -417,7 +417,7 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[Stru
   override def defaultSize: Int = fields.map(_.dataType.defaultSize).sum
 
   override def simpleString: String = {
-val fieldTypes = fields.view.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq
+val fieldTypes = fields.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq

Review Comment:
   If we want to refactor the `SparkStringUtils#truncatedString` function this 
way, I can revert this change and create a Jira ticket to track it.
   
   






Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


yaooqinn commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415318597


##
sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala:
##
@@ -417,7 +417,7 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[Stru
   override def defaultSize: Int = fields.map(_.dataType.defaultSize).sum
 
   override def simpleString: String = {
-val fieldTypes = fields.view.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq
+val fieldTypes = fields.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq

Review Comment:
   I guess the `view` was first added here for performance reasons. Later, 
the `toSeq` was added for compatibility with Scala 2.13, which defeated the 
original intention.






Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


LuciferYang commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415263533


##
sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala:
##
@@ -417,7 +417,7 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[Stru
   override def defaultSize: Int = fields.map(_.dataType.defaultSize).sum
 
   override def simpleString: String = {
-val fieldTypes = fields.view.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq
+val fieldTypes = fields.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq

Review Comment:
   No. The `.toSeq` will trigger computation, so this view doesn't achieve its 
intended effect unless we refactor the `truncatedString` method to accept 
`SeqView` instead of `Seq`.
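
The point about `.toSeq` can be checked with a small standalone sketch on Scala 2.13. Nothing below is Spark code: the object `ViewToSeqSketch`, its two helpers, and the `toUpperCase` mapping are assumptions chosen only to count how many elements get mapped.

```scala
object ViewToSeqSketch {
  // Counts how many elements are mapped when the view is immediately
  // converted with `.toSeq`, as in the `-` line of the diff.
  def mappedByToSeq(fields: Array[String]): Int = {
    var mapped = 0
    val forced = fields.view.map { f => mapped += 1; f.toUpperCase }.toSeq
    // `forced` is already a strict immutable Seq at this point.
    mapped
  }

  // Counts how many elements are mapped when the view stays a view and
  // only the first `n` elements are consumed.
  def mappedByViewOnly(fields: Array[String], n: Int): Int = {
    var mapped = 0
    val lazyView = fields.view.map { f => mapped += 1; f.toUpperCase }
    lazyView.take(n).foreach(_ => ())
    mapped
  }

  def main(args: Array[String]): Unit = {
    val fields = Array("a", "b", "c", "d")
    println(s"with toSeq: ${mappedByToSeq(fields)} elements mapped")
    println(s"view only, take(2): ${mappedByViewOnly(fields, 2)} elements mapped")
  }
}
```

With `.toSeq` in the chain, every element is mapped up front, so the `view` only adds an intermediate wrapper; the view helps only when the consumer itself works on the lazy collection.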






Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


yaooqinn commented on code in PR #44179:
URL: https://github.com/apache/spark/pull/44179#discussion_r1415256193


##
sql/api/src/main/scala/org/apache/spark/sql/types/StructType.scala:
##
@@ -417,7 +417,7 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[Stru
   override def defaultSize: Int = fields.map(_.dataType.defaultSize).sum
 
   override def simpleString: String = {
-val fieldTypes = fields.view.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq
+val fieldTypes = fields.map(field => s"${field.name}:${field.dataType.simpleString}").toSeq

Review Comment:
   Would creating a view for lazy transformation be better, given that we have 
the `maxToStringFields` limit?
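
One way to picture the question is the sketch below, which assumes a truncating join that accepts a `SeqView` directly. The helper `truncated` is hypothetical and is NOT Spark's `truncatedString`; its signature, its output format, and the `:string` suffix are all made up for illustration on Scala 2.13.

```scala
import scala.collection.SeqView

object LazyTruncateSketch {
  // Hypothetical truncating join over a lazy view: only the first
  // `maxFields` elements are ever rendered when the input is longer.
  def truncated(fields: SeqView[String], maxFields: Int): String = {
    // `length` on a mapped view is answered by the underlying collection,
    // so it does not run the mapping function.
    val total = fields.length
    if (total > maxFields) {
      val shown = fields.take(maxFields).mkString(",")
      s"$shown,... ${total - maxFields} more fields"
    } else {
      fields.mkString(",")
    }
  }

  // Returns the rendered string and the number of fields actually formatted.
  def render(names: Array[String], maxFields: Int): (String, Int) = {
    var rendered = 0
    val view = names.view.map { n => rendered += 1; s"$n:string" }
    (truncated(view, maxFields), rendered)
  }

  def main(args: Array[String]): Unit = {
    val (text, rendered) = render(Array("a", "b", "c", "d", "e"), 2)
    println(text)
    println(s"fields formatted: $rendered")
  }
}
```

Under these assumptions the per-field formatting cost is bounded by the limit rather than by the number of fields, which is the benefit a lazy view could offer if the consuming method were changed to accept one.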






Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions [spark]

2023-12-05 Thread via GitHub


LuciferYang commented on PR #44179:
URL: https://github.com/apache/spark/pull/44179#issuecomment-1840380311

   Could you review this PR? @cloud-fan @dongjoon-hyun @yaooqinn 
   
   





Re: [PR] [SPARK-46263][SQL][SS][ML][MLLIB][UI] Clean up unnecessary `SeqOps.view` and `ArrayOps.view` conversions. [spark]

2023-12-04 Thread via GitHub


LuciferYang commented on PR #44179:
URL: https://github.com/apache/spark/pull/44179#issuecomment-1840132785

   Let me double check the change first

