Github user dwmclary commented on the pull request:

    https://github.com/apache/spark/pull/3213#issuecomment-63145286
  
    Happy to help; these changes should be quick.
    
       - Sure, the wrapper for PySpark makes more sense; I hadn't considered
       that we'd be shipping the objects back and forth through py4j.
       - Jackson should be a simple change; I'll just go hash -> JSON via
       ObjectMapper, if that makes sense.
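
    The escaping edge case called out in the review is easy to reproduce. A
    minimal Python sketch (stdlib `json` standing in for Jackson's
    ObjectMapper on the JVM side) shows why naive string concatenation
    breaks when a column name contains a quote character:

    ```python
    import json

    # A row whose column name contains a quote character -- the edge
    # case called out in the review.
    row = {'col"name': 'value with "quotes" and \\ backslashes'}

    # Naive hand-rolled generation: no escaping, produces invalid JSON.
    naive = "{" + ", ".join('"%s": "%s"' % (k, v) for k, v in row.items()) + "}"

    # Library-based generation (Jackson's ObjectMapper plays this role
    # on the JVM): all special characters are escaped correctly.
    proper = json.dumps(row)

    try:
        json.loads(naive)
        naive_ok = True
    except ValueError:
        naive_ok = False

    print(naive_ok)                   # False -- the naive string fails to parse
    print(json.loads(proper) == row)  # True  -- the library round-trips cleanly
    ```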
    
    Cheers,
    Dan
    
    On Fri, Nov 14, 2014 at 2:27 PM, Michael Armbrust <[email protected]>
    wrote:
    
    > Thanks for working on this. I have two high level comments:
    >
    >    - I think it would be better to have a single implementation in Scala
    >    with a wrapper in Python. This way we don't have to serialize / ship
    >    the objects to Python, which seems like it might be expensive,
    >    especially if the next thing you are going to do is something like
    >    saveAsTextFile.
    >    - It would also be better to use Jackson to do the generation of the
    >    JSON string, as there are a lot of tricky edge cases around escaping
    >    that we need to handle if we do it ourselves. For example, I think
    >    this version will fail if a column name contains a quote character.
    >
    > /cc @yhuai <https://github.com/yhuai>
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/3213#issuecomment-63139261>.
    >
