[jira] [Commented] (SPARK-24977) input_file_name() result can't save and use for partitionBy()

2018-07-31 Thread kevin yu (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564490#comment-16564490
 ] 

kevin yu commented on SPARK-24977:
--

Hello Srinivasarao: Can you show the steps that led to the problem? I just 
did a quick test and it seems to work fine, but I am not sure it is the same 
as your case.

 

scala> spark.read.textFile("file:///etc/passwd")
res3: org.apache.spark.sql.Dataset[String] = [value: string]

scala> res3.select(input_file_name() as "input", expr("10 as col2")).write.partitionBy("input").saveAsTable("passwd3")
18/07/31 16:11:59 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException

scala> spark.sql("select * from passwd3").show
+----+------------------+
|col2|             input|
+----+------------------+
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
|  10|file:///etc/passwd|
+----+------------------+
only showing top 20 rows

 

> input_file_name() result can't save and use for partitionBy()
> -------------------------------------------------------------
>
>              Key: SPARK-24977
>              URL: https://issues.apache.org/jira/browse/SPARK-24977
>          Project: Spark
>       Issue Type: Bug
>       Components: PySpark, Spark Core, SQL
> Affects Versions: 2.3.1
>         Reporter: Srinivasarao Padala
>         Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24977) input_file_name() result can't save and use for partitionBy()

2018-07-31 Thread Srinivasarao Padala (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563424#comment-16563424
 ] 

Srinivasarao Padala commented on SPARK-24977:
-

I am able to see the file names with df.show(), but after saving to a file 
the column shows up empty, and I am not able to use it for partitionBy() 
while saving. I get this error:

java.lang.AssertionError: assertion failed: Empty partition column value in 'filename='
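
The report above can be narrowed down with a small check before the write. The following is only a sketch under stated assumptions, not a confirmed reproduction or fix: the column name "filename" is taken from the error message, while the object name, the helper names, the input file, and the output path /tmp/passwd_by_file are hypothetical. It materializes input_file_name() as an ordinary column, counts rows where that column is null or empty (an empty value is what the assertion message complains about), and writes with partitionBy() only when no such rows exist.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, input_file_name}

// Hypothetical helper object; names are for illustration only.
object InputFileNamePartitionCheck {

  // Adds the source file of each row as a regular column named "filename".
  def withFileName(df: DataFrame): DataFrame =
    df.withColumn("filename", input_file_name())

  // Counts rows whose "filename" value is null or an empty string.
  def emptyFileNames(df: DataFrame): Long =
    df.filter(col("filename").isNull || col("filename") === "").count()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("input-file-name-check")
      .master("local[*]")
      .getOrCreate()

    // Same input file as in the quick test earlier in this thread.
    val df = withFileName(spark.read.text("file:///etc/passwd"))

    val bad = emptyFileNames(df)
    println(s"rows with empty filename: $bad")

    // Only partition by the column when every row carries a file name.
    // The output path is an assumption chosen for this sketch.
    if (bad == 0L) {
      df.write.mode("overwrite").partitionBy("filename").parquet("/tmp/passwd_by_file")
    }

    spark.stop()
  }
}
```

If the count is non-zero, the exact steps between reading the files and calling input_file_name() would be the useful detail to post on the issue.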
