Github user dgingrich commented on a diff in the pull request:
https://github.com/apache/spark/pull/17227#discussion_r124135474
--- Diff: python/pyspark/sql/types.py ---
@@ -1249,7 +1249,7 @@ def _infer_schema_type(obj, dataType):
}
-def _verify_type(obj, dataType, nullable=True):
+def _verify_type(obj, dataType, nullable=True, name="obj"):
--- End diff --
Set `name=value` in the call at session.py line 516.
It will still print `obj` if the schema is a StructType: `TypeError: obj.a:
MyStructType can not accept object 'a' in type <type 'str'>`. Would you like
to change that too?
Right now changing the default name to None would make the error message
worse: `TypeError: None: IntegerType can not accept object 'a' in type <type
'str'>`.
The best way to make the error message pretty is probably:
- Set the default name to None
- If `name` is None, don't prepend the `%s: ` prefix to the error messages
That would make your example: `TypeError: IntegerType can not accept object
'a' in type <type 'str'>`.
IMO `obj` is not as pretty, but it's reasonable since it's so simple. Let me
know what you prefer. My only goal is that the next time I get a schema failure
it tells me which field to look at :)