-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21306/#review43466
-----------------------------------------------------------



datafu-pig/src/main/java/datafu/pig/maps/BagToMap.java
<https://reviews.apache.org/r/21306/#comment77626>

    The keep-last check is cheaper than looking up the key in the map, and 
is almost always true because it is the default case. (It also has the same 
value on every call, which branch predictors love.) It might be better to 
structure this as:
    
    if      (keep last) 
      put into map
    else if (! has key)
      put into map
    else if (keep first)
      do nothing
    else 
      throw exception
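
    A minimal Java sketch of that ordering, for illustration only. The
    OnDuplicate enum, the putWithPolicy helper, and the exception type are
    hypothetical names, not taken from the patch; the real UDF may model the
    option differently.

```java
import java.util.HashMap;
import java.util.Map;

class DuplicateKeySketch {
    // Hypothetical policy names; the patch under review may use other ones.
    enum OnDuplicate { KEEP_LAST, KEEP_FIRST, FAIL }

    static void putWithPolicy(Map<String, Object> map, String key, Object value,
                              OnDuplicate policy) {
        if (policy == OnDuplicate.KEEP_LAST) {
            // Default case: no lookup needed, the newest value always wins.
            map.put(key, value);
        } else if (!map.containsKey(key)) {
            // First time this key is seen: every policy stores it.
            map.put(key, value);
        } else if (policy == OnDuplicate.KEEP_FIRST) {
            // Duplicate key: keep the value that is already stored.
        } else {
            // Duplicate key under the strict policy.
            throw new IllegalStateException("Duplicate key: " + key);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        putWithPolicy(map, "color", "red", OnDuplicate.KEEP_LAST);
        putWithPolicy(map, "color", "blue", OnDuplicate.KEEP_LAST);
        System.out.println(map.get("color"));
    }
}
```

    With this ordering the default path does a single put and never calls
    containsKey.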



datafu-pig/src/main/java/datafu/pig/maps/PutToMap.java
<https://reviews.apache.org/r/21306/#comment77630>

    I think this would be better if the signature were PutToMap(map, 
kvs:bag{(key,val)}).
    
    That is cleaner than doing all the odd/even checks on field counts, 
would let you validate the data on the front end rather than at execution 
time, and seems the most useful for how I'd think to use this:
    
    -- existing facts about a peep
    peeps = LOAD 'profiles' AS (id:chararray, profile:map[]);
    -- new facts about a peep. A new fact should clobber an old one with the same key.
    info  = LOAD 'new_attrs' AS (id:chararray, attr:chararray, val:chararray);
    
    -- collect all new facts about a peep together
    peep_info = FOREACH (GROUP info BY id) GENERATE group AS id, info.(attr,val) AS attr_vals;
    
    -- merge every new fact into the existing profile.
    updated = FOREACH (JOIN peeps BY id, peep_info BY id) GENERATE peeps::id AS id, PutToMap(profile, attr_vals) AS updated_profile;
    
    (written for clarity even though that only needs one COGROUP)
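
    To make the intended semantics concrete, here is a plain-Java sketch with
    no Pig types. It is only an assumption about how PutToMap(map, kvs) could
    behave: the PutToMapSketch class and its putAll helper are hypothetical,
    and a list of key/value entries stands in for the bag of (key,val) tuples.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutToMapSketch {
    // Returns a copy of base with every (key, val) pair applied in order.
    // A pair whose key already exists clobbers the old value.
    static Map<String, Object> putAll(Map<String, Object> base,
                                      List<Map.Entry<String, Object>> kvs) {
        Map<String, Object> result = new HashMap<>(base);
        for (Map.Entry<String, Object> kv : kvs) {
            result.put(kv.getKey(), kv.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Existing profile for one peep.
        Map<String, Object> profile = new HashMap<>();
        profile.put("city", "Austin");

        // New facts, standing in for the attr_vals bag.
        List<Map.Entry<String, Object>> attrVals = new ArrayList<>();
        attrVals.add(new SimpleEntry<String, Object>("city", "Chicago"));
        attrVals.add(new SimpleEntry<String, Object>("lang", "pig"));

        System.out.println(putAll(profile, attrVals));
    }
}
```

    Because every element is already a (key, val) pair, there is no odd/even
    field-count check to get wrong at execution time.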


- Philip (flip) Kromer


On May 11, 2014, 12:42 p.m., wang jian wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21306/
> -----------------------------------------------------------
> 
> (Updated May 11, 2014, 12:42 p.m.)
> 
> 
> Review request for DataFu.
> 
> 
> Bugs: https://issues.apache.org/jira/browse/DATAFU-34
> 
> 
> Repository: datafu
> 
> 
> Description
> -------
> 
> Add UDFs to handle map type data.
> 
> Unit tests done.
> 
> One thing left over: validate the byte array output from map
> 
> 
> Diffs
> -----
> 
>   datafu-pig/src/main/java/datafu/pig/maps/BagToMap.java PRE-CREATION 
>   datafu-pig/src/main/java/datafu/pig/maps/ContainAllKeys.java PRE-CREATION 
>   datafu-pig/src/main/java/datafu/pig/maps/ContainAnyKey.java PRE-CREATION 
>   datafu-pig/src/main/java/datafu/pig/maps/ContainKeys.java PRE-CREATION 
>   datafu-pig/src/main/java/datafu/pig/maps/MapToBag.java PRE-CREATION 
>   datafu-pig/src/main/java/datafu/pig/maps/PutToMap.java PRE-CREATION 
>   datafu-pig/src/test/java/datafu/test/pig/maps/MapTests.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/21306/diff/
> 
> 
> Testing
> -------
> 
> yes
> 
> 
> Thanks,
> 
> wang jian
> 
>
