Yes, we use HashMap in 0.8.1. In 0.9, we are using ArrayList, so you
might see fewer issues like this.
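
To illustrate the difference, here is a minimal sketch. It is not Pig's
actual code: the class PlanOrderSketch and its methods are made up for
illustration, and only the aliases a..g come from the script quoted below.
An ArrayList iterates in insertion order, while a HashMap makes no promise
about iteration order, so a test that depends on a HashMap should compare
contents without depending on order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PlanOrderSketch {

    // Aliases from the quoted script; using them here is only for illustration.
    static final List<String> ALIASES =
            Arrays.asList("a", "b", "c", "d", "e", "f", "g");

    // ArrayList iterates in insertion order, so this result is reproducible.
    static List<String> listOrder() {
        return new ArrayList<>(ALIASES);
    }

    // HashMap does not specify an iteration order; the key order returned
    // here may differ between JVM versions or implementations.
    static List<String> hashMapOrder() {
        Map<String, Integer> scopes = new HashMap<>();
        for (int i = 0; i < ALIASES.size(); i++) {
            scopes.put(ALIASES.get(i), i); // hypothetical alias -> scope id
        }
        return new ArrayList<>(scopes.keySet());
    }

    // Order-insensitive comparison: sort copies of both lists, then compare.
    static boolean sameIgnoringOrder(List<String> x, List<String> y) {
        List<String> sx = new ArrayList<>(x);
        List<String> sy = new ArrayList<>(y);
        Collections.sort(sx);
        Collections.sort(sy);
        return sx.equals(sy);
    }

    public static void main(String[] args) {
        System.out.println("ArrayList order: " + listOrder());
        System.out.println("HashMap order:   " + hashMapOrder());
        System.out.println("Same contents:   "
                + sameIgnoringOrder(listOrder(), hashMapOrder()));
    }
}
```

A test written against sameIgnoringOrder (or against a LinkedHashMap, which
does keep insertion order) would not break if the HashMap order changed.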

Daniel

2011/8/23 lulynn_2008 <[email protected]>:
> Hello,
> I have a question about how Pig processes commands internally.
> For example, take these Pig commands (from
> TestNewPlanLogToPhyTranslationVisitor.java):
>        a = load 'd1.txt' as (id, c);
>        b = load 'd2.txt' as (id, c);
>        c = load 'd3.txt' as (id, c);
>        d = join a by id, b by c;
>        e = filter d by a::id==NULL AND b::c==NULL;
>        f = join e by b::c, c by id;
>        g = filter f by b::id==NULL AND c::c==NULL;
>        store g into 'empty2';
> Pig uses the buildPlan method to produce a LogicalPlan like this:
> |
> |---g: Filter scope-24 Schema: {e::a::id: bytearray,e::a::c: 
> bytearray,e::b::id: bytearray,e::b::c: bytearray,c::id: bytearray,c::c: 
> bytearray} Type: bag
>    |   |
>    |   And scope-23 FieldSchema: boolean Type: boolean
>    |   |
>    |   |---Equal scope-19 FieldSchema: boolean Type: boolean
>    |   |   |
>    |   |   |---Project scope-17 Projections: [2] Overloaded: false 
> FieldSchema: e::b::id: bytearray Type: bytearray
>    |   |   |   Input: f: LOJoin scope-16
>    |   |   |
>    |   |   |---Const scope-18( null ) FieldSchema: bytearray Type: bytearray
>    |   |
>    |   |---Equal scope-22 FieldSchema: boolean Type: boolean
>    |       |
>    |       |---Project scope-20 Projections: [5] Overloaded: false 
> FieldSchema: c::c: bytearray Type: bytearray
>    |       |   Input: f: LOJoin scope-16
>    |       |
>    |       |---Const scope-21( null ) FieldSchema: bytearray Type: bytearray
>    |
>    |---f: LOJoin scope-16 Schema: {e::a::id: bytearray,e::a::c: 
> bytearray,e::b::id: bytearray,e::b::c: bytearray,c::id: bytearray,c::c: 
> bytearray} Type: bag
>        |   |
>        |   Project scope-14 Projections: [3] Overloaded: false FieldSchema: 
> b::c: bytearray Type: bytearray
>        |   Input: e: Filter scope-13
>        |   |
>        |   Project scope-15 Projections: [0] Overloaded: false FieldSchema: 
> id: bytearray Type: bytearray
>        |   Input: c: Load scope-2
>        |
>        |---c: Load scope-2 Schema: {id: bytearray,c: bytearray} Type: bag
>        |
>        |---e: Filter scope-13 Schema: {a::id: bytearray,a::c: 
> bytearray,b::id: bytearray,b::c: bytearray} Type: bag
>            |   |
>            |   And scope-12 FieldSchema: boolean Type: boolean
>            |   |
>            |   |---Equal scope-8 FieldSchema: boolean Type: boolean
>            |   |   |
>            |   |   |---Project scope-6 Projections: [0] Overloaded: false 
> FieldSchema: a::id: bytearray Type: bytearray
>            |   |   |   Input: d: LOJoin scope-5
>            |   |   |
>            |   |   |---Const scope-7( null ) FieldSchema: bytearray Type: 
> bytearray
>            |   |
>            |   |---Equal scope-11 FieldSchema: boolean Type: boolean
>            |       |
>            |       |---Project scope-9 Projections: [3] Overloaded: false 
> FieldSchema: b::c: bytearray Type: bytearray
>            |       |   Input: d: LOJoin scope-5
>            |       |
>            |       |---Const scope-10( null ) FieldSchema: bytearray Type: 
> bytearray
>            |
>            |---d: LOJoin scope-5 Schema: {a::id: bytearray,a::c: 
> bytearray,b::id: bytearray,b::c: bytearray} Type: bag
>                |   |
>                |   Project scope-3 Projections: [0] Overloaded: false 
> FieldSchema: id: bytearray Type: bytearray
>                |   Input: a: Load scope-0
>                |   |
>                |   Project scope-4 Projections: [1] Overloaded: false 
> FieldSchema: c: bytearray Type: bytearray
>                |   Input: b: Load scope-1
>                |
>                |---a: Load scope-0 Schema: {id: bytearray,c: bytearray} Type: 
> bag
>                |
>                |---b: Load scope-1 Schema: {id: bytearray,c: bytearray} Type: 
> bag
>
> I assume that both the command analysis and the intermediate data storage
> are based on HashMap. Is this correct?
> I found that some test cases expect output that depends on HashMap
> iteration order. In my opinion, such tests should not expect a single fixed
> output, because HashMap iteration order is not guaranteed to be stable.
> Please let me know what you think. Thank you.
>
>
>
