GitHub user james64 opened a pull request:

    https://github.com/apache/spark/pull/2712

    [SPARK-3121] Wrong implementation of implicit bytesWritableConverter

    val path = ... // path to a sequence file with BytesWritable as the type of both key and value
    val file = sc.sequenceFile[Array[Byte],Array[Byte]](path)
    file.take(1)(0)._1
    
    This prints incorrect byte array content. The returned array starts with
    the correct bytes, but some "random" bytes and zeros are appended to it.
    BytesWritable has two methods:
    
    getBytes() - returns the whole internal array, which is often longer
    than the value actually stored. It usually contains leftovers of previous,
    longer values.
    
    copyBytes() - returns just the beginning of the internal array, as
    determined by the internal length property.
    
    It looks like the implicit conversion between BytesWritable and Array[Byte]
    uses getBytes instead of the correct copyBytes.
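
The difference can be reproduced outside Spark with a minimal sketch like the one below. It assumes Hadoop's `org.apache.hadoop.io.BytesWritable` is on the classpath; `BytesWritableDemo` and `validBytes` are hypothetical names used only for illustration, and this is not necessarily the code in the patch. The helper trims the backing array to `getLength`, which should give the same bytes that `copyBytes()` returns on Hadoop versions that provide it.

```scala
import java.util.Arrays
import org.apache.hadoop.io.BytesWritable

object BytesWritableDemo {
  // Hypothetical helper: copy only the valid prefix [0, getLength)
  // of the backing array instead of returning the whole buffer.
  def validBytes(bw: BytesWritable): Array[Byte] =
    Arrays.copyOfRange(bw.getBytes, 0, bw.getLength)

  def main(args: Array[String]): Unit = {
    val bw = new BytesWritable()

    // Store a longer value first, then reuse the same object for a
    // shorter one, as a record reader reusing its Writable would.
    bw.set(Array[Byte](1, 2, 3, 4, 5), 0, 5)
    bw.set(Array[Byte](9, 9), 0, 2)

    val raw = bw.getBytes          // whole backing array, stale bytes included
    val trimmed = validBytes(bw)   // only the bytes of the current value

    println(s"getLength = ${bw.getLength}, getBytes.length = ${raw.length}")
    println(s"raw     = ${raw.mkString(",")}")
    println(s"trimmed = ${trimmed.mkString(",")}")
  }
}
```

The exact size of the backing array depends on how BytesWritable grows its buffer, so only the trimmed result should be relied on.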
    
    @dbtsai


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/james64/spark 3121-bugfix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2712.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2712
    
----
commit 480f9cdaf69254dd429b949d9ccc6d0b2c617ad0
Author: Dubovsky Jakub <dubov...@avast.com>
Date:   2014-10-08T13:49:41Z

    Bug 3121 fixed

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
