Store the entire xml in one line...

On 10/29/09, Steve Gao <[email protected]> wrote:
> Does anybody have the similar issue? If you store XML files in HDFS, how can
> you make sure a chunk reads by a mapper does not contain partical data of an
> XML segment?
>
> For example:
>
> <title>
> <book>book1</book>
> <author>me</author>
> ..............what if this is the boundary of a chunk?...................
> <year>2009</year>
> <book>book2</book>
>
> <author>me</author>
>
> <year>2009</year>
> <book>book3</book>
>
> <author>me</author>
>
> <year>2009</year>
> <title>
>
>
>
>


-- 


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

Reply via email to