-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16533/#review31404
-----------------------------------------------------------


Thanks Josh, this is a very well written code to make Accumulo available to 
Pig. I can make it work locally. I have two major comment:
1. Bear in mind, Pig is good to manipulate tuple, Ok for map, and horrible for 
bag. A columnar data would be perfect. AccumuloWholeRowStorage does not seems 
pretty useful since it requires user to manipulate bag in Pig script. User will 
have to write a UDF for it. AccumuloKVStorage is fine, but it is also awkward 
to manipulate the flattened data. AccumuloStorage is the best style among these 
three, user can project map entries to get the columnar data. But personally, I 
would still prefer HBastStorage style in which user can specify the column he's 
interested in construct arguments. So I would suggest to improve 
AccumuloStorage (or a new class) to include that.

2. caster is not right. You will need to return correct caster in 
getLoadCaster. Otherwise cast data into Pig type will fail. Eg:
a = load 'accumulo://....' using 
org.apache.pig.backend.hadoop.accumulo.AccumuloStorage() as (key:bytearray, 
m:map[]);
b = foreach a generate (double)m#'gpa';
I see you give Utf8StorageConverter as the converter in some of the loaders, 
but Utf8StorageConverter will assume numerical data in string style, and try to 
parse the string into numerical. I see there is objToBytes in both 
AbstractAccumuloStorage and Utils. These should be consolidated into a caster 
(refer to HBaseBinaryConverter)


build.xml
<https://reviews.apache.org/r/16533/#comment59869>

    Guess we don't need to create a profile for Accumulo. HBase create it 
because they have jar layout change between 94/95. Not the case for Accumulo



src/org/apache/pig/backend/hadoop/accumulo/AbstractAccumuloStorage.java
<https://reviews.apache.org/r/16533/#comment59880>

    Better to ship dependent jars automatically, so user don't have to register 
these jars in Pig script. See HBaseStorage.initializeHBaseClassLoaderResources. 
But that we can do in a follow up ticket.



src/org/apache/pig/backend/hadoop/accumulo/AccumuloStorage.java
<https://reviews.apache.org/r/16533/#comment59871>

    Missing Javadoc for the usage.



src/org/apache/pig/backend/hadoop/accumulo/AccumuloStorage.java
<https://reviews.apache.org/r/16533/#comment59870>

    Boolean construct argument does not work. Pig will always pass String. Need 
to parse the String to boolean in the construct.


- Daniel Dai


On Dec. 31, 2013, 2:02 a.m., Josh Elser wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16533/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2013, 2:02 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Bugs: PIG-3573
>     https://issues.apache.org/jira/browse/PIG-3573
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Provides basic StoreFunc and LoadFunc implementations. Based off of code that 
> was in an Accumulo contrib project.
> 
> 
> Diffs
> -----
> 
>   build.xml 575c9ae 
>   ivy.xml 180eb2c 
>   ivy/libraries.properties 14abdf8 
>   src/org/apache/pig/backend/hadoop/accumulo/AbstractAccumuloStorage.java 
> PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/accumulo/AccumuloKVStorage.java 
> PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/accumulo/AccumuloStorage.java 
> PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/accumulo/AccumuloWholeRowStorage.java 
> PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/accumulo/Utils.java PRE-CREATION 
>   
> test/org/apache/pig/backend/hadoop/accumulo/AbstractAccumuloStorageTest.java 
> PRE-CREATION 
>   test/org/apache/pig/backend/hadoop/accumulo/AccumuloKVStorageTest.java 
> PRE-CREATION 
>   test/org/apache/pig/backend/hadoop/accumulo/AccumuloPigClusterTest.java 
> PRE-CREATION 
>   
> test/org/apache/pig/backend/hadoop/accumulo/AccumuloStorageConfigurationTest.java
>  PRE-CREATION 
>   test/org/apache/pig/backend/hadoop/accumulo/AccumuloStorageTest.java 
> PRE-CREATION 
>   
> test/org/apache/pig/backend/hadoop/accumulo/AccumuloWholeRowStorageTest.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/16533/diff/
> 
> 
> Testing
> -------
> 
> Local tests reading, writing and JOIN'ing Accumulo tables. Tested against 
> Hadoop-1.0.4 and 2.2.0, with Accumulo 1.5.0
> 
> 
> Thanks,
> 
> Josh Elser
> 
>

Reply via email to