[ https://issues.apache.org/jira/browse/SPARK-17587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-17587:
------------------------------------

    Assignee:     (was: Apache Spark)

> SparseVector __getitem__ should follow __getitem__ contract
> -----------------------------------------------------------
>
>                 Key: SPARK-17587
>                 URL: https://issues.apache.org/jira/browse/SPARK-17587
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, MLlib, PySpark
>    Affects Versions: 1.6.0, 2.0.0
>            Reporter: Maciej Szymkiewicz
>            Priority: Minor
>
> According to the {{\_\_getitem\_\_}}
> [contract|https://docs.python.org/3/reference/datamodel.html#object.__getitem__]:
> {quote}
> if of a value outside the set of indexes for the sequence (after any special
> interpretation of negative values), {{IndexError}} should be raised.
> {quote}
> This is required, for example, for correct iteration over the structure.
> Right now it throws {{ValueError}}, which results in quite confusing behavior:
> an attempt to iterate over a vector fails with a {{ValueError}} because the
> iteration is never terminated by an {{IndexError}}:
> {code}
> In [1]: from pyspark.mllib.linalg import SparseVector
> In [2]: list(SparseVector(4, [0], [0]))
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
> <ipython-input-2-147f3bb0a47d> in <module>()
> ----> 1 list(SparseVector(4, [0], [0]))
> /opt/spark-2.0/python/pyspark/mllib/linalg/__init__.py in __getitem__(self, index)
>     803
>     804         if index >= self.size or index < -self.size:
> --> 805             raise ValueError("Index %d out of bounds." % index)
>     806         if index < 0:
>     807             index += self.size
> ValueError: Index 4 out of bounds.
> {code}
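The contract quoted in the issue can be sketched with a small stand-in class. This is only a minimal illustration, not the PySpark implementation or the fix that was eventually applied: `TinySparseVector` and its internals are hypothetical names, and only the bounds check mirrors the lines shown in the traceback, with `IndexError` swapped in for `ValueError`. Because the class defines `__getitem__` but not `__iter__`, Python falls back to the legacy sequence iteration protocol, which calls `__getitem__(0)`, `__getitem__(1)`, and so on until `IndexError` is raised.

```python
import bisect


class TinySparseVector:
    """Hypothetical stand-in for a sparse vector; not the PySpark class."""

    def __init__(self, size, indices, values):
        self.size = size
        self.indices = list(indices)  # sorted positions of the stored entries
        self.values = list(values)    # values stored at those positions

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("Index must be an integer, got %s" % type(index).__name__)
        if index >= self.size or index < -self.size:
            # IndexError (rather than ValueError) is what lets iteration stop cleanly.
            raise IndexError("Index %d out of bounds." % index)
        if index < 0:
            index += self.size
        # Binary search for the position; entries that are not stored are zero.
        pos = bisect.bisect_left(self.indices, index)
        if pos < len(self.indices) and self.indices[pos] == index:
            return self.values[pos]
        return 0.0


v = TinySparseVector(4, [0], [1.0])
print(list(v))  # [1.0, 0.0, 0.0, 0.0]
```

With `ValueError` in place of `IndexError`, the same `list(v)` call would propagate the exception from index 4 instead of ending the iteration, which matches the behavior reported in the issue.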