Great work!

Steven

On Sun, May 17, 2015 at 1:15 PM, Efi <efika...@gmail.com> wrote:
> Hello everyone,
>
> This is my update on what I have been doing this last week:
>
> I created an XMLInputFormat Java class with the functionality that Hamza
> described in the issue [1]. The class reads from blocks located in HDFS and
> returns complete items according to a specified XML tag.
> I also tested this class in a standalone Hadoop cluster with XML files of
> various sizes, the smallest being a single file of 400 MB and the largest a
> collection of 5 files totalling 6.1 GB.
>
> This week I will create another implementation of the XMLInputFormat with
> a different way of reading and delivering files, the way I described in the
> same issue, and I will test both solutions in a standalone and a small
> Hadoop cluster (5-6 nodes).
>
> You can see this week's results here [2]. I will keep updating this file
> with the results of the other tests.
>
> Best regards,
> Efi
>
> [1] https://issues.apache.org/jira/browse/VXQUERY-131
> [2]
> https://docs.google.com/spreadsheets/d/1kyIPR7izNMbU8ctIe34rguElaoYiWQmJpAwDb0t9MCw/edit?usp=sharing
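
For readers following the thread, the idea of returning complete items from a block according to a specified XML tag can be sketched roughly as below. This is only an illustration under stated assumptions, not the XMLInputFormat class from VXQUERY-131: the class name XmlItemScanner and the method extractItems are hypothetical, the data is an in-memory string rather than an HDFS block, and start tags with attributes or nested items of the same name are not handled. The rule it shows is that a split owns every item whose start tag begins inside it and reads past its own end to finish that item.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of VXQuery or Hadoop.
final class XmlItemScanner {

    /**
     * Returns every complete <tag>...</tag> item whose start tag begins
     * inside [splitStart, splitEnd). An item that begins inside the split
     * is read through to its end tag even when that end tag lies past
     * splitEnd, so each item is returned by exactly one split.
     */
    static List<String> extractItems(String data, int splitStart, int splitEnd, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        List<String> items = new ArrayList<>();
        int pos = splitStart;
        while (true) {
            int start = data.indexOf(open, pos);
            // Stop when no start tag is left or it belongs to a later split.
            if (start < 0 || start >= splitEnd) {
                break;
            }
            int end = data.indexOf(close, start + open.length());
            if (end < 0) {
                break; // unterminated item: nothing complete to return
            }
            end += close.length();
            items.add(data.substring(start, end));
            pos = end;
        }
        return items;
    }
}
```

Calling extractItems once per split over the same data, with adjacent offset ranges, should yield each item exactly once across all splits, which is the property a real input format needs when an item straddles a block boundary.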