Feynman Liang created SPARK-12806:
-------------------------------------
Summary: Support SQL expressions extracting values from VectorUDT
Key: SPARK-12806
URL: https://issues.apache.org/jira/browse/SPARK-12806
Project: Spark
Issue Type: Improvement
Components: MLlib, SQL
Affects Versions: 1.6.0
Reporter: Feynman Liang
Some use cases require the value at a specific index of a {VectorUDT} column
in a {DataFrame}. For example, we may want to extract a specific class
probability from the {probabilityCol} of a {LogisticRegression} model in order
to compute losses. However, if {probability} is a column of {df} with type
{VectorUDT}, the following code fails:
{code}
df.select("probability.0")
AnalysisException: u"Can't extract value from probability"
{code}
The exception is thrown from
{sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala}.
Because {VectorUDT} is essentially a wrapper around a {StructType}, one would
expect it to support value-extraction Expressions in the same way a struct
column does.
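
To make the analogy concrete, here is a minimal plain-Python sketch of what index extraction over the underlying struct could look like. It assumes the struct layout that {VectorUDT} uses for serialization (fields type, size, indices, values, with type 0 for sparse and 1 for dense). The names VectorStruct and value_at are hypothetical and are not part of any Spark API; this only illustrates the lookup logic, not a proposed implementation.

```python
from typing import NamedTuple, Optional, Sequence


class VectorStruct(NamedTuple):
    """Hypothetical stand-in for the struct a VectorUDT value is stored as."""
    type: int                         # assumed encoding: 0 = sparse, 1 = dense
    size: Optional[int]               # logical length; populated for sparse only
    indices: Optional[Sequence[int]]  # positions of stored entries; sparse only
    values: Sequence[float]           # all entries (dense) or stored entries (sparse)


def value_at(vec: VectorStruct, i: int) -> float:
    """Return element i of the vector described by the struct (hypothetical helper)."""
    length = len(vec.values) if vec.type == 1 else vec.size
    if not 0 <= i < length:
        raise IndexError("index %d out of range for vector of size %d" % (i, length))
    if vec.type == 1:
        # Dense: values holds every element in order.
        return vec.values[i]
    # Sparse: only explicitly stored entries appear; everything else is 0.0.
    for idx, v in zip(vec.indices, vec.values):
        if idx == i:
            return v
    return 0.0


dense = VectorStruct(type=1, size=None, indices=None, values=[0.2, 0.8])
sparse = VectorStruct(type=0, size=3, indices=[1], values=[0.5])
print(value_at(dense, 0), value_at(sparse, 1), value_at(sparse, 2))
```

An extraction expression such as "probability.0" would need to perform an equivalent lookup, handling both the dense and the sparse encoding.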