GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/5303

    [SPARK-6638] [SQL] Improve performance of StringType in SQL

    This PR changes the internal representation of StringType from
java.lang.String to UTF8String, which is implemented using Array[Byte] (encoded
in UTF-8).
    
    This PR should not break any public API; Row.getString() will still return
java.lang.String.
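
    To illustrate the idea described above, here is a minimal sketch of a
string wrapper backed by UTF-8 bytes. It is not the UTF8String class from this
PR: the name Utf8Bytes and all of its methods are hypothetical, chosen only to
show how comparisons can work on raw bytes while decoding to java.lang.String
is deferred to the public API boundary.

```scala
import java.nio.charset.StandardCharsets
import java.util.Arrays

// Hypothetical illustration only; not the class added by this PR.
final class Utf8Bytes(private val bytes: Array[Byte]) {
  // Size of the UTF-8 encoding in bytes.
  def numBytes: Int = bytes.length

  // Number of code points: count every byte that is not a
  // UTF-8 continuation byte (bit pattern 10xxxxxx).
  def numChars: Int = bytes.count(b => (b & 0xC0) != 0x80)

  // Byte-wise prefix check; no decoding is needed.
  def startsWith(prefix: Utf8Bytes): Boolean =
    prefix.bytes.length <= bytes.length &&
      prefix.bytes.indices.forall(i => bytes(i) == prefix.bytes(i))

  // Equality and hashing operate directly on the encoded bytes.
  override def equals(other: Any): Boolean = other match {
    case that: Utf8Bytes => Arrays.equals(bytes, that.bytes)
    case _               => false
  }
  override def hashCode: Int = Arrays.hashCode(bytes)

  // Decode only when a java.lang.String is actually requested,
  // for example by a public getter such as Row.getString().
  override def toString: String = new String(bytes, StandardCharsets.UTF_8)
}

object Utf8Bytes {
  // Encode once when data enters the engine.
  def fromString(s: String): Utf8Bytes =
    new Utf8Bytes(s.getBytes(StandardCharsets.UTF_8))
}
```

    In this sketch, callers that need a java.lang.String call toString, so
the byte-based representation can stay internal.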
    
    This is the first step toward improving the performance of String in SQL.
    
    cc @rxin

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark string

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5303.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5303
    
----
commit 685fd071ce453cc6b956f98c897c869ad31702a9
Author: Davies Liu <[email protected]>
Date:   2015-03-31T05:42:07Z

    use UTF8String instead of String for StringType

commit 21f67c6fda3504caa0b13524d4e498c6e4c9c701
Author: Davies Liu <[email protected]>
Date:   2015-03-31T07:50:11Z

    cleanup

commit 4699c3ae1dab6482b26dd3d3739193e68cd77ca3
Author: Davies Liu <[email protected]>
Date:   2015-03-31T20:46:42Z

    use Array[Byte] in UTF8String

commit d32abd1e8e6b7b5ef92a34a5d3a42919db58a43c
Author: Davies Liu <[email protected]>
Date:   2015-03-31T20:57:17Z

    fix utf8 for python api

commit a85fb275d742dd9384e15f22878b545e9a77a106
Author: Davies Liu <[email protected]>
Date:   2015-03-31T23:42:18Z

    refactor

----


