robertwb commented on PR #21762:
URL: https://github.com/apache/beam/pull/21762#issuecomment-1274256193

   I was thinking this could also be useful in other cases where the dtype(s)
   cannot be inferred, though I can see how that would be sketchy too.
   Would you prefer we add a new coerce_objects_to_strings operation
   to deferred dataframes? (Or I suppose I could do it manually using
   DataFrame.astype, iterating over the columns.) Let me see what that looks
   like.
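
   For illustration, the manual approach mentioned above might look roughly
   like the sketch below. It uses plain pandas rather than Beam's deferred
   dataframes, and the helper name simply borrows the operation name proposed
   in this thread; it is hypothetical, not an existing Beam or pandas API.
   Note that plain astype stringifies values, so this sketch does not by
   itself implement the raise-at-execution-time check discussed below.

```python
import pandas as pd


def coerce_objects_to_strings(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: cast every object-dtype column to pandas' string dtype."""
    # Collect the columns whose dtype is the generic Python-object dtype.
    object_cols = [name for name, dtype in df.dtypes.items() if dtype == object]
    # astype accepts a {column: dtype} mapping; other columns are left untouched.
    return df.astype({name: "string" for name in object_cols})


# Small example frame with one explicitly object-typed column.
df = pd.DataFrame({
    "name": pd.Series(["a", "b"], dtype=object),
    "n": pd.Series([1, 2], dtype="int64"),
})
out = coerce_objects_to_strings(df)
```

   Whether the same mapping-based astype call behaves identically on a
   deferred dataframe is an assumption that would need checking against Beam.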
   
   On Wed, Jun 8, 2022 at 5:11 PM Brian Hulette wrote:
   
   > and convert_dtypes requires looking at the entire PCollection to figure
   > out the proxy object
   >
   > Right, I know we can't do convert_dtypes exactly, I just meant we could
   > add something similar to it, that assumes all object columns are coercible
   > to strings, and raises error at execution time if they're not.
   >
   > I think doing that conversion explicitly with the DataFrame API would be
   > preferable to plumbing the typehint through the schema code. But maybe that
   > uglies things up to have to do read_csv().coerce_objects_to_strings()?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.