Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/15264
  
    Hi @sameeragarwal, I just meant that if we want to read new and old Parquet files (one having int and the other having long for the same column), as described in the JIRA and PR description, we could do one of the following.
    
    - Read them with the "inferred" schema. (-1 if this PR tries to deal with this)
      We pick up a single Parquet file to infer the Spark-side schema. In this case, it is ambiguous which file is "new" and which is "old", so sometimes it would fail to read long as int (downcasting) and sometimes it would succeed to read int as long (upcasting).
    
    - Read them with a user-specified schema. (I am neutral on this)
      This would become a subset of [SPARK-16544](https://issues.apache.org/jira/browse/SPARK-16544). I already submitted a PR for that, https://github.com/apache/spark/pull/14215, but decided to close it in favor of a better approach. If this looks good, I'd like to re-open my old PR. I guess the approach here is virtually the same as my old one.
    
    - Read them with merged schemas. (-1 if this PR tries to deal with this)
      This case would face [SPARK-15516](https://issues.apache.org/jira/browse/SPARK-15516).
    
    If this PR only deals with the second approach, we should take care of more types. I mean, I guess we don't want multiple PRs and JIRAs for each type, each data source and the vectorized/non-vectorized readers. I might be wrong but this was what I thought.
    
    BTW, if this change looks okay, I'd try to reopen my previous one rather than duplicating effort.

