[ https://issues.apache.org/jira/browse/SPARK-22398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235083#comment-16235083 ]
Liang-Chi Hsieh commented on SPARK-22398:
-----------------------------------------

[~mgaido], I'd prefer to treat them as integers by default, because you can easily disable `partitionColumnTypeInference` to read them as strings. If the inference treated them as strings instead, there would be no way to get integers back by disabling the inference.

> Partition directories with leading 0s cause wrong results
> ---------------------------------------------------------
>
>                 Key: SPARK-22398
>                 URL: https://issues.apache.org/jira/browse/SPARK-22398
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Bogdan Raducanu
>            Priority: Major
>
> Repro case:
> {code}
> spark.range(8).selectExpr("'0' || cast(id as string) as id", "id as b").write.mode("overwrite").partitionBy("id").parquet("/tmp/bug1")
> spark.read.parquet("/tmp/bug1").where("id in ('01')").show
> +---+---+
> |  b| id|
> +---+---+
> +---+---+
> spark.read.parquet("/tmp/bug1").where("id = '01'").show
> +---+---+
> |  b| id|
> +---+---+
> |  1|  1|
> +---+---+
> {code}
> I think somewhere there is some special handling of this case for equals but not the same for IN.
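
The workaround mentioned in the comment (disabling partition column type inference so that the `id` directories are read as strings) might look roughly like the sketch below. It assumes the comment's `partitionColumnTypeInference` refers to the `spark.sql.sources.partitionColumnTypeInference.enabled` setting, reuses the `/tmp/bug1` path from the repro, and wraps everything in a hypothetical `PartitionInferenceSketch` object with a local master so it can run outside spark-shell. It is an illustration only, not the fix proposed for this issue.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, used only so the sketch is self-contained.
object PartitionInferenceSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SPARK-22398 sketch")
      .master("local[*]") // assumption: local run, for illustration
      .getOrCreate()

    // Write the same data as the repro: partition values "00" .. "07".
    spark.range(8)
      .selectExpr("'0' || cast(id as string) as id", "id as b")
      .write.mode("overwrite").partitionBy("id").parquet("/tmp/bug1")

    // With inference enabled (the default), print the inferred schema
    // to see which type the partition column `id` gets.
    spark.read.parquet("/tmp/bug1").printSchema()

    // Disable inference so partition values are kept as strings.
    spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "false")

    // Print the schema again, then re-run the IN filter from the repro
    // against the string-typed partition column.
    spark.read.parquet("/tmp/bug1").printSchema()
    spark.read.parquet("/tmp/bug1").where("id in ('01')").show()

    spark.stop()
  }
}
```

In spark-shell, only the `spark.conf.set(...)` line and the reads are needed, since `spark` already exists there.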