Re: [#131]Supporting Hadoop data and cluster management

Steven Jacobs Mon, 18 May 2015 08:34:32 -0700

Great work!
Steven

On Sun, May 17, 2015 at 1:15 PM, Efi <[email protected]> wrote:


> Hello everyone,
>
> This is my update on what I have been doing this last week:
>
> I created an XMLInputFormat Java class with the functionality that Hamza
> described in the issue [1]. The class reads from blocks located in HDFS and
> returns complete items according to a specified XML tag.
> I also tested this class on a standalone Hadoop cluster with XML files of
> various sizes, the smallest being a single file of 400 MB and the largest a
> collection of 5 files totalling 6.1 GB.
>
> This week I will create another implementation of the XMLInputFormat that
> reads and delivers files in a different way, the one I described in the
> same issue. I will then test both solutions on a standalone setup and on a
> small Hadoop cluster (5-6 nodes).
>
> You can see this week's results here [2]. I will keep updating this file
> with the results of the other tests.
>
> Best regards,
> Efi
>
> [1] https://issues.apache.org/jira/browse/VXQUERY-131
> [2]
> https://docs.google.com/spreadsheets/d/1kyIPR7izNMbU8ctIe34rguElaoYiWQmJpAwDb0t9MCw/edit?usp=sharing
>
>
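
The tag-based reading that Efi describes above (return complete items
delimited by a specified XML tag) can be sketched roughly as follows. This is
not the VXQuery XMLInputFormat: the class name TagRecordSketch and both method
names are hypothetical, the input is a plain java.io.Reader rather than an HDFS
block, and the sketch assumes the start tag carries no attributes and that
items with the same tag are not nested. A real Hadoop record reader would also
have to handle items that span block boundaries, which is left out here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the class discussed in the thread.
public class TagRecordSketch {

    // Consumes characters until `match` has been seen in full.
    // When `out` is non-null, every consumed character is appended to it.
    // Returns false if the stream ends before a full match.
    public static boolean readUntilMatch(Reader in, String match, StringBuilder out)
            throws IOException {
        int matched = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (out != null) {
                out.append((char) c);
            }
            if (c == match.charAt(matched)) {
                matched++;
                if (matched == match.length()) {
                    return true;
                }
            } else {
                // '<' occurs only at index 0 of a tag pattern, so after a
                // mismatch the only possible partial match is a single '<'.
                matched = (c == match.charAt(0)) ? 1 : 0;
            }
        }
        return false;
    }

    // Returns every complete <tag>...</tag> item found in the stream.
    public static List<String> extractItems(Reader in, String tag) throws IOException {
        String start = "<" + tag + ">";
        String end = "</" + tag + ">";
        List<String> items = new ArrayList<>();
        // Skip ahead to the next start tag, discarding what precedes it.
        while (readUntilMatch(in, start, null)) {
            StringBuilder item = new StringBuilder(start);
            // Buffer everything up to and including the end tag.
            if (!readUntilMatch(in, end, item)) {
                break; // stream ended inside an item: drop the incomplete item
            }
            items.add(item.toString());
        }
        return items;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<lib><book><t>A</t></book>\n<book><t>B</t></book></lib>";
        for (String item : extractItems(new StringReader(xml), "book")) {
            System.out.println(item);
        }
    }
}
```

Scanning for the literal start and end tags keeps the reader streaming: only
the current item is buffered, so memory use depends on item size rather than
file size.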
