Hi!
Casting another int column that is not a partition column fails with the same
error.
The schema before the cast (column names are anonymized):
root
 |-- valueObject: struct (nullable = true)
 |    |-- value1: string (nullable = true)
 |    |-- value2: string (nullable = true)
 |    |-- value3: timestamp (nullable = true)
 |    |-- value4: string (nullable = true)
 |-- partitionColumn2: string (nullable = true)
 |-- partitionColumn3: timestamp (nullable = true)
 |-- partitionColumn1: integer (nullable = true)
I wanted to cast partitionColumn1 to String, which gives me the described error.
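
As background on why partitionColumn1 shows up as integer at all: the Spark documentation on partition discovery says the types of partition columns are inferred automatically from the directory values, and that with spark.sql.sources.partitionColumnTypeInference.enabled set to false, string type is used instead. Below is a rough, deliberately simplified sketch of that idea in plain Java. The class and method names are hypothetical, it only distinguishes int, long and a string fallback (no date, timestamp or decimal handling), and it is not Spark's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Spark.
class PartitionTypeSketch {

    // Guess a type name for one raw partition value taken from a directory name.
    static String inferType(String raw) {
        try {
            Integer.parseInt(raw);
            return "integer";
        } catch (NumberFormatException e) {
            // Not an int; fall through and try long.
        }
        try {
            Long.parseLong(raw);
            return "long";
        } catch (NumberFormatException e) {
            // Not a long either; fall back to string.
        }
        return "string";
    }

    // Walk a relative path such as "a=1/b=x" and infer a type per name=value segment.
    static Map<String, String> inferFromPath(String relativePath) {
        Map<String, String> types = new LinkedHashMap<>();
        for (String segment : relativePath.split("/")) {
            int eq = segment.indexOf('=');
            if (eq <= 0) {
                continue; // skip segments that are not name=value directories
            }
            types.put(segment.substring(0, eq), inferType(segment.substring(eq + 1)));
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(inferFromPath("partitionColumn1=42/partitionColumn2=abc"));
    }
}
```

With inference of this kind, a directory value like 42 ends up as integer even when the column is logically a string, which is what I am trying to undo with the cast.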
Best,
Rico
> On 17 Feb 2022, at 09:56, ayan guha <guha.a...@gmail.com> wrote:
>
> Can you try to cast any other Int field which is NOT a partition column?
>
> On Thu, 17 Feb 2022 at 7:34 pm, Gourav Sengupta <gourav.sengu...@gmail.com>
> wrote:
>> Hi,
>>
>> This appears interesting, casting INT to STRING has never been an issue for
>> me.
>>
>> Can you share the output of df.printSchema()?
>>
>> I prefer to use SQL, and the method I use for casting is:
>> CAST(<<column name>> AS STRING) <<alias>>.
>>
>> Regards,
>> Gourav
>>
>> On Thu, Feb 17, 2022 at 6:02 AM Rico Bergmann <i...@ricobergmann.de> wrote:
>>> Here is the code snippet:
>>>
>>> var df = session.read().parquet(basepath);
>>> // Re-cast each partition column to the type recorded in our own schema metadata.
>>> for (Column partition : partitionColumnsList) {
>>>     df = df.withColumn(partition.getName(),
>>>             df.col(partition.getName()).cast(partition.getType()));
>>> }
>>>
>>> Column is a class containing schema information, for example the name and
>>> the data type of the column.
>>>
>>> Best, Rico.
>>>
>>> > On 17 Feb 2022, at 03:17, Morven Huang <morven.hu...@gmail.com> wrote:
>>> >
>>> > Hi Rico, do you have a code snippet? I have no problem casting int to
>>> > string.
>>> >
>>> >> On 17 Feb 2022, at 12:26 AM, Rico Bergmann <i...@ricobergmann.de> wrote:
>>> >>
>>> >> Hi!
>>> >>
>>> >> I am reading a partitioned DataFrame into Spark using automatic type
>>> >> inference for the partition columns. For one partition column the data
>>> >> contains an integer, therefore Spark uses IntegerType for this column. In
>>> >> general, this is supposed to be a StringType column, so I tried to cast
>>> >> this column to StringType. But this fails with AnalysisException “cannot
>>> >> cast int to string”.
>>> >>
>>> >> Is this a bug? Or is it really not allowed to cast an int to a string?
>>> >>
>>> >> I’m using Spark 3.1.1.
>>> >>
>>> >> Best regards
>>> >>
>>> >> Rico.
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>> >>
>>> >
> --
> Best Regards,
> Ayan Guha