Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    This approach doesn't feel right to me. The goal of the change is to avoid making a pass over the values to find out whether there are any explicit zeros that need to be eliminated, which is fine. But instead of letting the user specify how many non-zero elements there are, we should let them pass a boolean that says whether or not to bother removing explicit zeros. Here's a small example to demonstrate why:
    
    ````scala
    val v = Vectors.dense(1, 2, 3)
    val sv = v.toSparse(2)
    ````
    
    This raises a `java.lang.ArrayIndexOutOfBoundsException` because I specified the wrong number of non-zero elements. If we instead use a boolean like `removeExplicitZeros`, then when it is `false` we can simply make the `values` array the same size as the dense `values` array in the dense case, and just return `this` in the sparse case. So I'm suggesting a method like:
    
    ````scala
    def toSparse(removeExplicitZeros: Boolean): SparseVector
    ````
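
    To make the intent concrete, here's a rough standalone sketch of the `removeExplicitZeros = false` path, written against the public `ml.linalg` constructors rather than as the actual member methods (the helper names below are just for illustration, not the proposed API):

    ````scala
    import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vectors}

    // Dense case with removeExplicitZeros = false: no pass over the data to count
    // or drop zeros; indices are just 0 until size and the values are copied as-is.
    def denseToSparseKeepingZeros(dv: DenseVector): SparseVector =
      new SparseVector(dv.size, Array.range(0, dv.size), dv.values.clone())

    // Sparse case with removeExplicitZeros = false: nothing to do, return the vector.
    def sparseToSparseKeepingZeros(sv: SparseVector): SparseVector = sv

    val dv = Vectors.dense(1, 2, 0).toDense
    denseToSparseKeepingZeros(dv)  // keeps the explicit zero: indices [0, 1, 2], values [1.0, 2.0, 0.0]
    ````

    Neither path needs a pass over the values to count or drop zeros, which is what the change is trying to avoid.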
    
    That boolean overload won't compile as written, though, because of ambiguous reference errors (another reason unit tests are so important). I ran into that problem before and never found a good solution, so you'll have to come up with a way around it.

