[jira] [Resolved] (SPARK-31123) Drop does not work after join with aliases

2020-03-17 Thread Mikel San Vicente (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikel San Vicente resolved SPARK-31123.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> Drop does not work after join with aliases
> --
>
> Key: SPARK-31123
> URL: https://issues.apache.org/jira/browse/SPARK-31123
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.2
>Reporter: Mikel San Vicente
>Priority: Major
> Fix For: 3.0.0
>
>
>  
> Hi,
> I am seeing really strange behaviour in the drop method after a join with
> aliases. It doesn't seem to find the column when I refer to it using
> dataframe("columnName") syntax, but it does work with other combinators like
> select:
> {code:java}
> case class Record(a: String, dup: String)
> case class Record2(b: String, dup: String)
> val df = Seq(Record("a", "dup")).toDF
> val df2 = Seq(Record2("a", "dup")).toDF 
> val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
> val dupCol = df("dup")
> joined.drop(dupCol) // Does not drop anything
> joined.drop(func.col("a.dup")) // It drops the column  
> joined.select(dupCol) // It selects the column
> {code}
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-31123) Drop does not work after join with aliases

2020-03-16 Thread Mikel San Vicente (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikel San Vicente updated SPARK-31123:
--
Description: 
 

Hi,

I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select:
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)

val df = Seq(Record("a", "dup")).toDF
val df2 = Seq(Record2("a", "dup")).toDF 
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It drops the column  
joined.select(dupCol) // It selects the column
{code}
 

 

 

  was:
 

Hi,

I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select:
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It drops the column  
joined.select(dupCol) // It selects the column
{code}
 

 

 


> Drop does not work after join with aliases
> --
>
> Key: SPARK-31123
> URL: https://issues.apache.org/jira/browse/SPARK-31123
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.2
>Reporter: Mikel San Vicente
>Priority: Major
>
>  
> Hi,
> I am seeing really strange behaviour in the drop method after a join with
> aliases. It doesn't seem to find the column when I refer to it using
> dataframe("columnName") syntax, but it does work with other combinators like
> select:
> {code:java}
> case class Record(a: String, dup: String)
> case class Record2(b: String, dup: String)
> val df = Seq(Record("a", "dup")).toDF
> val df2 = Seq(Record2("a", "dup")).toDF 
> val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
> val dupCol = df("dup")
> joined.drop(dupCol) // Does not drop anything
> joined.drop(func.col("a.dup")) // It drops the column  
> joined.select(dupCol) // It selects the column
> {code}
>  
>  
>  






[jira] [Commented] (SPARK-31123) Drop does not work after join with aliases

2020-03-16 Thread Mikel San Vicente (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060046#comment-17060046
 ] 

Mikel San Vicente commented on SPARK-31123:
---

Hi L.C.,

Did you perform the join using aliases? Otherwise the issue won't happen.

> Drop does not work after join with aliases
> --
>
> Key: SPARK-31123
> URL: https://issues.apache.org/jira/browse/SPARK-31123
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.2
>Reporter: Mikel San Vicente
>Priority: Major
>
>  
> Hi,
> I am seeing really strange behaviour in the drop method after a join with
> aliases. It doesn't seem to find the column when I refer to it using
> dataframe("columnName") syntax, but it does work with other combinators like
> select:
> {code:java}
> case class Record(a: String, dup: String)
> case class Record2(b: String, dup: String)
> val df = Seq(Record("a", "dup")).toDF
> val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
> val dupCol = df("dup")
> joined.drop(dupCol) // Does not drop anything
> joined.drop(func.col("a.dup")) // It drops the column  
> joined.select(dupCol) // It selects the column
> {code}
>  
>  
>  






[jira] [Updated] (SPARK-31123) Drop does not work after join with aliases

2020-03-12 Thread Mikel San Vicente (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikel San Vicente updated SPARK-31123:
--
Priority: Major  (was: Minor)

> Drop does not work after join with aliases
> --
>
> Key: SPARK-31123
> URL: https://issues.apache.org/jira/browse/SPARK-31123
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.2
>Reporter: Mikel San Vicente
>Priority: Major
>
>  
> Hi,
> I am seeing really strange behaviour in the drop method after a join with
> aliases. It doesn't seem to find the column when I refer to it using
> dataframe("columnName") syntax, but it does work with other combinators like
> select:
> {code:java}
> case class Record(a: String, dup: String)
> case class Record2(b: String, dup: String)
> val df = Seq(Record("a", "dup")).toDF
> val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
> val dupCol = df("dup")
> joined.drop(dupCol) // Does not drop anything
> joined.drop(func.col("a.dup")) // It drops the column  
> joined.select(dupCol) // It selects the column
> {code}
>  
>  
>  






[jira] [Updated] (SPARK-31123) Drop does not work after join with aliases

2020-03-11 Thread Mikel San Vicente (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikel San Vicente updated SPARK-31123:
--
Description: 
 

Hi,

I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select:
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It drops the column  
joined.select(dupCol) // It selects the column
{code}
 

 

 

  was:
 

Hi,

I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select:
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It works!  
joined.select(dupCol) // It works!
{code}
 

 

 


> Drop does not work after join with aliases
> --
>
> Key: SPARK-31123
> URL: https://issues.apache.org/jira/browse/SPARK-31123
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.2
>Reporter: Mikel San Vicente
>Priority: Minor
>
>  
> Hi,
> I am seeing really strange behaviour in the drop method after a join with
> aliases. It doesn't seem to find the column when I refer to it using
> dataframe("columnName") syntax, but it does work with other combinators like
> select:
> {code:java}
> case class Record(a: String, dup: String)
> case class Record2(b: String, dup: String)
> val df = Seq(Record("a", "dup")).toDF
> val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
> val dupCol = df("dup")
> joined.drop(dupCol) // Does not drop anything
> joined.drop(func.col("a.dup")) // It drops the column  
> joined.select(dupCol) // It selects the column
> {code}
>  
>  
>  






[jira] [Created] (SPARK-31123) Drop does not work after join with aliases

2020-03-11 Thread Mikel San Vicente (Jira)
Mikel San Vicente created SPARK-31123:
-

 Summary: Drop does not work after join with aliases
 Key: SPARK-31123
 URL: https://issues.apache.org/jira/browse/SPARK-31123
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.2
Reporter: Mikel San Vicente


 

Hi,

I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select:
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It works!  
joined.select(dupCol) // It works!
{code}
 

 

 






[jira] [Commented] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters

2017-11-12 Thread Mikel San Vicente (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248999#comment-16248999
 ] 

Mikel San Vicente commented on SPARK-22442:
---

Are you planning to release a patch for version 2.2.x with this bug fix?

> Schema generated by Product Encoder doesn't match case class field name when 
> using non-standard characters
> --
>
> Key: SPARK-22442
> URL: https://issues.apache.org/jira/browse/SPARK-22442
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.2, 2.2.0
>Reporter: Mikel San Vicente
>Assignee: Liang-Chi Hsieh
> Fix For: 2.3.0
>
>
> The product encoder encodes special characters incorrectly when a field name
> contains certain nonstandard characters.
> For example, for:
> {quote}
> case class MyType(`field.1`: String, `field 2`: String)
> {quote}
> we will get the following schema
> {quote}
> root
>  |-- field$u002E1: string (nullable = true)
>  |-- field$u00202: string (nullable = true)
> {quote}
> As a consequence of this issue a DataFrame with the correct schema can't be 
> converted to a Dataset using .as[MyType]
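
A minimal sketch, assuming the `$u002E` and `$u0020` fragments in the schema above come from the Scala compiler's encoding of characters that are not valid in JVM identifiers. It uses only `scala.reflect.NameTransformer` from the Scala standard library and does not reproduce Spark's product encoder; the object name `EncodedFieldNames` and its helper methods are hypothetical.

```scala
import scala.reflect.NameTransformer

// Hypothetical helper object, for illustration only; not part of Spark.
object EncodedFieldNames {
  // Field names taken from the MyType case class in the report.
  val fieldNames: Seq[String] = Seq("field.1", "field 2")

  // JVM-safe form of a Scala identifier, as produced by the standard library.
  def encode(name: String): String = NameTransformer.encode(name)

  // Inverse mapping, back to the name as it was written in the source.
  def decode(name: String): String = NameTransformer.decode(name)

  def main(args: Array[String]): Unit =
    fieldNames.foreach { name =>
      val encoded = encode(name)
      // Print the original name, its encoded form, and the decoded round trip.
      println(s"$name -> $encoded -> ${decode(encoded)}")
    }
}
```

For `field.1` the encoded form is `field$u002E1`, the same string that appears in the reported schema, which suggests that the schema was built from the encoded name rather than the decoded one.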



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters

2017-11-05 Thread Mikel San Vicente (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239821#comment-16239821
 ] 

Mikel San Vicente commented on SPARK-22442:
---

Yes, that will work, but it won't work for the correct schema that is
inferred if you read directly from JSON:

spark.read.json(path).as[MyType]

It won't work because the inferred schema will be
[field.1: string, field 2: string]

> Schema generated by Product Encoder doesn't match case class field name when 
> using non-standard characters
> --
>
> Key: SPARK-22442
> URL: https://issues.apache.org/jira/browse/SPARK-22442
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.2, 2.2.0
>Reporter: Mikel San Vicente
>
> The product encoder encodes special characters incorrectly when a field name
> contains certain nonstandard characters.
> For example, for:
> {quote}
> case class MyType(`field.1`: String, `field 2`: String)
> {quote}
> we will get the following schema
> {quote}
> root
>  |-- field$u002E1: string (nullable = true)
>  |-- field$u00202: string (nullable = true)
> {quote}
> As a consequence of this issue a DataFrame with the correct schema can't be 
> converted to a Dataset using .as[MyType]






[jira] [Updated] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters

2017-11-05 Thread Mikel San Vicente (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikel San Vicente updated SPARK-22442:
--
Description: 
The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:
{quote}
case class MyType(`field.1`: String, `field 2`: String)
{quote}
we will get the following schema

{quote}
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
{quote}

As a consequence of this issue a DataFrame with the correct schema can't be 
converted to a Dataset using .as[MyType]

  was:
The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:
{quote}
case class MyType(`field.1`: String, `field 2`: String)
{quote}
we will get the following schema

{quote}
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
{quote}




> Schema generated by Product Encoder doesn't match case class field name when 
> using non-standard characters
> --
>
> Key: SPARK-22442
> URL: https://issues.apache.org/jira/browse/SPARK-22442
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.2, 2.2.0
>Reporter: Mikel San Vicente
>
> The product encoder encodes special characters incorrectly when a field name
> contains certain nonstandard characters.
> For example, for:
> {quote}
> case class MyType(`field.1`: String, `field 2`: String)
> {quote}
> we will get the following schema
> {quote}
> root
>  |-- field$u002E1: string (nullable = true)
>  |-- field$u00202: string (nullable = true)
> {quote}
> As a consequence of this issue a DataFrame with the correct schema can't be 
> converted to a Dataset using .as[MyType]






[jira] [Updated] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters

2017-11-03 Thread Mikel San Vicente (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikel San Vicente updated SPARK-22442:
--
Description: 
The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:
{quote}
case class MyType(`field.1`: String, `field 2`: String)
{quote}
we will get the following schema

{quote}
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
{quote}



  was:
The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:


{{
case class MyType(`field.1`: String, `field 2`: String)
}}
we will get the following schema

{{
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
}}




> Schema generated by Product Encoder doesn't match case class field name when 
> using non-standard characters
> --
>
> Key: SPARK-22442
> URL: https://issues.apache.org/jira/browse/SPARK-22442
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.2, 2.2.0
>Reporter: Mikel San Vicente
>Priority: Normal
>
> The product encoder encodes special characters incorrectly when a field name
> contains certain nonstandard characters.
> For example, for:
> {quote}
> case class MyType(`field.1`: String, `field 2`: String)
> {quote}
> we will get the following schema
> {quote}
> root
>  |-- field$u002E1: string (nullable = true)
>  |-- field$u00202: string (nullable = true)
> {quote}






[jira] [Updated] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters

2017-11-03 Thread Mikel San Vicente (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikel San Vicente updated SPARK-22442:
--
Description: 
The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:


{{case class MyType(`field.1`: String, `field 2`: String)
}}
we will get the following schema

{{root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)}}



  was:
The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:

```
case class MyType(`field.1`: String, `field 2`: String)
```
we will get the following schema
```
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
```



> Schema generated by Product Encoder doesn't match case class field name when 
> using non-standard characters
> --
>
> Key: SPARK-22442
> URL: https://issues.apache.org/jira/browse/SPARK-22442
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.2, 2.2.0
>Reporter: Mikel San Vicente
>Priority: Normal
>
> The product encoder encodes special characters incorrectly when a field name
> contains certain nonstandard characters.
> For example, for:
> {{case class MyType(`field.1`: String, `field 2`: String)
> }}
> we will get the following schema
> {{root
>  |-- field$u002E1: string (nullable = true)
>  |-- field$u00202: string (nullable = true)}}






[jira] [Updated] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters

2017-11-03 Thread Mikel San Vicente (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikel San Vicente updated SPARK-22442:
--
Description: 
The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:


{{
case class MyType(`field.1`: String, `field 2`: String)
}}
we will get the following schema

{{
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
}}



  was:
The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:


{{case class MyType(`field.1`: String, `field 2`: String)
}}
we will get the following schema

{{root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)}}




> Schema generated by Product Encoder doesn't match case class field name when 
> using non-standard characters
> --
>
> Key: SPARK-22442
> URL: https://issues.apache.org/jira/browse/SPARK-22442
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.2, 2.1.2, 2.2.0
>Reporter: Mikel San Vicente
>Priority: Normal
>
> The product encoder encodes special characters incorrectly when a field name
> contains certain nonstandard characters.
> For example, for:
> {{
> case class MyType(`field.1`: String, `field 2`: String)
> }}
> we will get the following schema
> {{
> root
>  |-- field$u002E1: string (nullable = true)
>  |-- field$u00202: string (nullable = true)
> }}






[jira] [Created] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters

2017-11-03 Thread Mikel San Vicente (JIRA)
Mikel San Vicente created SPARK-22442:
-

 Summary: Schema generated by Product Encoder doesn't match case 
class field name when using non-standard characters
 Key: SPARK-22442
 URL: https://issues.apache.org/jira/browse/SPARK-22442
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.2.0, 2.1.2, 2.0.2
Reporter: Mikel San Vicente
Priority: Normal


The product encoder encodes special characters incorrectly when a field name
contains certain nonstandard characters.

For example, for:

```
case class MyType(`field.1`: String, `field 2`: String)
```
we will get the following schema
```
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
```



