Github user MLnick commented on the pull request:
https://github.com/apache/spark/pull/1338#issuecomment-48820160
I have had a quick look over this and will try to do a more detailed review
this weekend.
At a high level it looks good. Two comments so far:
1. I agree with Matei that the tests should live in `tests.py` rather than in
docstrings. Tests for the other data types should be added too, in a similar
manner to the input format tests.
2. It would be great to add a couple of examples of using a custom
`Converter` in reverse, for output. Again, a Cassandra example and an HBase
example, in a similar vein to the input format examples, would be valuable.
I will provide more feedback as I go through it in more detail.
(btw thanks for fixing up the `ArrayWritable` stuff too).