Yes, we use HashMap in 0.8.1. In 0.9, we are using ArrayList, so you might see fewer issues like this.
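
To illustrate the point about test output: HashMap makes no promise about iteration order, so a test that compares against a single expected ordering can fail even when the result is correct. The sketch below is not Pig code. The class and method names (OrderInsensitiveCheck, keysInIterationOrder, sameElementsIgnoringOrder) are hypothetical, and the aliases a, b, d are only borrowed from the script quoted below. It shows one way a test could compare results without depending on order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Pig.
class OrderInsensitiveCheck {

    // Returns the map's keys in whatever order the map iterates them.
    // For a HashMap this order is unspecified and may differ between JVMs.
    public static List<String> keysInIterationOrder(Map<String, String> nodes) {
        return new ArrayList<>(nodes.keySet());
    }

    // Compares two lists as multisets: same elements, any order.
    public static boolean sameElementsIgnoringOrder(List<String> expected,
                                                    List<String> actual) {
        if (expected.size() != actual.size()) {
            return false;
        }
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        // Assumed stand-in for plan nodes keyed by alias.
        Map<String, String> byAlias = new HashMap<>();
        byAlias.put("d", "LOJoin");
        byAlias.put("a", "Load");
        byAlias.put("b", "Load");

        List<String> expected = Arrays.asList("a", "b", "d");
        List<String> actual = keysInIterationOrder(byAlias);

        // An exact-order comparison here would depend on HashMap internals;
        // the order-insensitive comparison does not.
        System.out.println(sameElementsIgnoringOrder(expected, actual));
    }
}
```

An ArrayList, by contrast, always iterates in insertion order, which is why a list-based plan is less likely to produce this kind of variation.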
Daniel

2011/8/23 lulynn_2008 <[email protected]>:
> Hello,
> I have some opinion about pig commands implementation procedure:
> For example:
> pig commands(from TestNewPlanLogToPhyTranslationVisitor.java):
> a = load 'd1.txt' as (id, c);
> b = load 'd2.txt'as (id, c);
> c = load 'd3.txt' as (id, c);
> d = join a by id, b by c;
> e = filter d by a::id==NULL AND b::c==NULL;
> f = join e by b::c, c by id;
> g = filter f by b::id==NULL AND c::c==NULL;
> store g into 'empty2';
> Pig will use buildPlan method to get LogicalPlan like this:
> |
> |---g: Filter scope-24 Schema: {e::a::id: bytearray,e::a::c: bytearray,e::b::id: bytearray,e::b::c: bytearray,c::id: bytearray,c::c: bytearray} Type: bag
> | |
> | And scope-23 FieldSchema: boolean Type: boolean
> | |
> | |---Equal scope-19 FieldSchema: boolean Type: boolean
> | | |
> | | |---Project scope-17 Projections: [2] Overloaded: false FieldSchema: e::b::id: bytearray Type: bytearray
> | | | Input: f: LOJoin scope-16
> | | |
> | | |---Const scope-18( null ) FieldSchema: bytearray Type: bytearray
> | |
> | |---Equal scope-22 FieldSchema: boolean Type: boolean
> | |
> | |---Project scope-20 Projections: [5] Overloaded: false FieldSchema: c::c: bytearray Type: bytearray
> | | Input: f: LOJoin scope-16
> | |
> | |---Const scope-21( null ) FieldSchema: bytearray Type: bytearray
> |
> |---f: LOJoin scope-16 Schema: {e::a::id: bytearray,e::a::c: bytearray,e::b::id: bytearray,e::b::c: bytearray,c::id: bytearray,c::c: bytearray} Type: bag
> | |
> | Project scope-14 Projections: [3] Overloaded: false FieldSchema: b::c: bytearray Type: bytearray
> | Input: e: Filter scope-13
> | |
> | Project scope-15 Projections: [0] Overloaded: false FieldSchema: id: bytearray Type: bytearray
> | Input: c: Load scope-2
> |
> |---c: Load scope-2 Schema: {id: bytearray,c: bytearray} Type: bag
> |
> |---e: Filter scope-13 Schema: {a::id: bytearray,a::c: bytearray,b::id: bytearray,b::c: bytearray} Type: bag
> | |
> | And scope-12 FieldSchema: boolean Type: boolean
> | |
> | |---Equal scope-8 FieldSchema: boolean Type: boolean
> | | |
> | | |---Project scope-6 Projections: [0] Overloaded: false FieldSchema: a::id: bytearray Type: bytearray
> | | | Input: d: LOJoin scope-5
> | | |
> | | |---Const scope-7( null ) FieldSchema: bytearray Type: bytearray
> | |
> | |---Equal scope-11 FieldSchema: boolean Type: boolean
> | |
> | |---Project scope-9 Projections: [3] Overloaded: false FieldSchema: b::c: bytearray Type: bytearray
> | | Input: d: LOJoin scope-5
> | |
> | |---Const scope-10( null ) FieldSchema: bytearray Type: bytearray
> |
> |---d: LOJoin scope-5 Schema: {a::id: bytearray,a::c: bytearray,b::id: bytearray,b::c: bytearray} Type: bag
> | |
> | Project scope-3 Projections: [0] Overloaded: false FieldSchema: id: bytearray Type: bytearray
> | Input: a: Load scope-0
> | |
> | Project scope-4 Projections: [1] Overloaded: false FieldSchema: c: bytearray Type: bytearray
> | Input: b: Load scope-1
> |
> |---a: Load scope-0 Schema: {id: bytearray,c: bytearray} Type: bag
> |
> |---b: Load scope-1 Schema: {id: bytearray,c: bytearray} Type: bag
>
> I assume the commands analysis and middle data storage are all based on
> HashMap structure. Is this correct?
> I found some test cases result are based on the result of HashMap analysis.
> Then in my opinion, our test case output result should not be single. As we
> know the output of HashMap analysis is not steadfast. Please give your
> opinion about my words. Thank you.
