[ 
https://issues.apache.org/jira/browse/SPARK-24943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17414535#comment-17414535
 ] 

Varun Bharill commented on SPARK-24943:
---------------------------------------

Hi [~hyukjin.kwon], 
Thank you for your response. The underlying Scala APIs seem to support char, 
varchar, and uniontype. Below is what I tried in spark-shell.

 
{code:java}
scala> val ddlSchemaStr = "fullName varchar(10),age char(10),gender uniontype<int, string>"
ddlSchemaStr: String = fullName varchar(10),age char(10),gender uniontype<int, string>
scala> val ddlSchema = StructType.fromDDL(ddlSchemaStr)
ddlSchema: org.apache.spark.sql.types.StructType = StructType(StructField(fullName,StringType,true), StructField(age,StringType,true), StructField(gender,StructType(StructField(tag_0,IntegerType,true), StructField(tag_1,StringType,true)),true))
{code}
 

 

However, when I call the same Scala/Java API from Python, char and varchar 
parse correctly, but uniontype still fails. 

 
{code:java}
json_obj = SparkContext._active_spark_context._jvm.org.apache.spark.sql.types.StructType.fromDDL("fullName varchar(10),age char(10),gender uniontype<int, string>")

Traceback (most recent call last):
  File "/Users/vbharill/PycharmProjects/scripts/outlawenv/lib/python3.8/site-packages/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.ParseException: 
mismatched input '<' expecting {<EOF>, '(', ',', 'COMMENT', NOT}(line 1, pos 50)
== SQL ==
fullName varchar(10),age char(10),gender uniontype<int, string>
--------------------------------------------------^^^
{code}
 

Could you clarify why the two behave differently even though both call the 
same API? I am also okay with using the second approach, converting the result 
to JSON and then parsing it as a JSON object, except that it still cannot 
parse uniontype. 
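
To illustrate the "convert to JSON, then parse" step, here is a minimal sketch using only the Python standard library. The JSON string is a hand-written sample (an assumption, not captured output) in the shape that StructType's JSON representation would take for the Scala result shown earlier; the `describe` helper and the `summary` name are hypothetical and exist only for this example.

```python
import json

# Assumption: hand-written sample mirroring the JSON form of the StructType
# printed by spark-shell above; field metadata is left empty for brevity.
SCHEMA_JSON = """
{"type": "struct", "fields": [
  {"name": "fullName", "type": "string", "nullable": true, "metadata": {}},
  {"name": "age", "type": "string", "nullable": true, "metadata": {}},
  {"name": "gender", "nullable": true, "metadata": {}, "type":
    {"type": "struct", "fields": [
      {"name": "tag_0", "type": "integer", "nullable": true, "metadata": {}},
      {"name": "tag_1", "type": "string", "nullable": true, "metadata": {}}]}}
]}
"""


def describe(node):
    """Render one schema JSON node as a compact type string (hypothetical helper)."""
    if isinstance(node, str):  # primitive types are stored as plain strings
        return node
    if node.get("type") == "struct":
        inner = ", ".join(f"{f['name']}: {describe(f['type'])}" for f in node["fields"])
        return f"struct<{inner}>"
    return str(node.get("type"))  # arrays, maps, etc. are not expanded here


schema = json.loads(SCHEMA_JSON)
summary = {f["name"]: describe(f["type"]) for f in schema["fields"]}
print(summary)
```

In a live session the string would presumably come from calling json() on the JVM StructType returned by fromDDL, and the parsed dict could then be handed to PySpark's StructType.fromJson; neither call is exercised here, and this only helps once fromDDL itself accepts the uniontype input.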

> Convert a SQL Struct to StructType
> ----------------------------------
>
>                 Key: SPARK-24943
>                 URL: https://issues.apache.org/jira/browse/SPARK-24943
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: mahmoud mehdi
>            Priority: Minor
>             Fix For: 2.4.0
>
>
> The main goal of this User Story is to add a method to StructType that does 
> the opposite of what the sql method does.
> For example, for the following SQL Struct: 
> {code:java}
> df.schema.sql
> //STRUCT<`price`: STRUCT<`amount`: BIGINT, `currency`: STRING>>{code}
>  We'll have the following output: 
> {code:java}
> StructType.fromSql(df.schema.sql)
> //StructType(StructField(price,StructType(StructField(amount,LongType,true), StructField(currency,StringType,true)),true))
> {code}
>  



